Abstract

Let $X$ be an $n \times p$ matrix and $l_1$ the largest eigenvalue of the covariance matrix $X^*X$. The "null case" where $X_{i,j} \sim \mathcal{N}(0,1)$ is of particular interest for principal component analysis. For this model, when $n, p \to \infty$ and $n/p \to \gamma \in \mathbb{R}_+^*$, it was shown in Johnstone (2001) that $l_1$, properly centered and scaled, converges to the Tracy-Widom law.

We show that with the same centering and scaling, the result is true even when $p/n$ or $n/p \to \infty$, therefore extending the previous result to $\gamma \in \overline{\mathbb{R}}_+$. The derivation uses ideas and techniques quite similar to the ones presented in Johnstone (2001). Following Soshnikov (2002), we also show that the same is true for the joint distribution of the $k$ largest eigenvalues, where $k$ is a fixed integer.

Numerical experiments illustrate the fact that the Tracy-Widom approximation is reasonable even when one of the dimensions is small.

1 Introduction

Large scale principal component analysis (PCA) - concerning an $n \times p$ matrix $X$ where $n$ and $p$ are both large - is nowadays a widely used tool in many fields, such as image analysis, signal processing, functional data analysis and quantitative finance. Several examples come to mind, including Eigenfaces, subspace filtering, or Laloux et al. (1999), where PCA (as well as some random matrix theory) is used to try to improve on the naive solution to Markowitz's portfolio optimization problem.

Important progress has been made recently in our understanding of the statistical properties of the square of the largest singular value of a random matrix $X$ under the "null model" where its entries are iid $\mathcal{N}(0,1)$. Specifically, if we denote the sample eigenvalues of $X'X$ by $l_1 > \dots > l_p$, call

$$n_1 = \max(n,p) - 1 , \quad p_1 = \min(n,p) ,$$

$$\mu_{np} = (\sqrt{n_1} + \sqrt{p_1})^2 ,$$

$$\sigma_{np} = (\sqrt{n_1} + \sqrt{p_1}) \left( \frac{1}{\sqrt{n_1}} + \frac{1}{\sqrt{p_1}} \right)^{1/3} ,$$

and $W_1$ the Tracy-Widom distribution (see A0), it was shown in Johnstone (2001) that

Theorem 1 (Johnstone) If $n, p \to \infty$ and $n/p \to \gamma \in (0, \infty)$,

$$\frac{l_1 - \mu_{np}}{\sigma_{np}} \Rightarrow W_1 .$$



Building on Johnstone (2001) and using properties of determinantal point processes, Soshnikov (2002) showed that the same result holds for the $k$ largest eigenvalues, where $k$ is a fixed integer: their joint distribution converges to their Tracy-Widom counterpart.

This is a very interesting development because the classical theory (e.g. Anderson (1984)) was developed under the assumption that $p$ was fixed and $n$ grew to $\infty$, whereas in modern day applications both $p$ and $n$ are large. However, Johnstone's assumption $n/p \to \gamma$ imposes a limit on the validity of his result which one would like to remove. In an actual data analysis, with given $p$ and $n$, $n = o(p)$ and $n \asymp p$ could be equally plausible. Furthermore, a specific $X$ of size $n \times p$ could arise in many triangular array settings, where we have $X_j$ of size $n_j \times p_j$, and the limitation $n_j/p_j \to \gamma$ finite might only hold in some triangular situations and not in others.

Accordingly, in this paper we weaken the assumption that $n/p \to \gamma$ finite and show that

Theorem 2 If $n, p \to \infty$ and $n/p \to \infty$,

$$\frac{l_1 - \mu_{np}}{\sigma_{np}} \Rightarrow W_1 .$$

Moreover, with the same centering and scaling, the joint distribution of the $k$ largest eigenvalues converges in law to its Tracy-Widom counterpart.
Dually, the same result holds if $n/p \to 0$.

Let us note that the remark we made about centering and scaling sequences after Theorem 1 is still valid in this context.

There is clearly a mathematical motivation for dealing with this problem: the result completes the picture about the properties of $l_1$ with large $p$ and $n$ and, in a sense, closes Theorem 1. But is it interesting from a statistical standpoint?

The situation $p \gg n$ is indeed a fairly common one in modern statistics. Microarray data are a prototypical example: currently they usually have $p$ of the order of a few thousands and $n$ of the order of a few tens. One encounters $p \gg n$ or $n \gg p$ in many other instances: data collection mechanisms are now effective enough to, for example, collect and retain thousands of pieces of information for millions of customers (transactional data), or millions of pieces of information for thousands of stocks (tick-by-tick data in finance). Analyzing these very high dimensional datasets raises new challenges and is at the center of recent statistical work, both applied and theoretical.

Microarray analysis in particular is a very active field, and has contributed a flurry of activity in non-classical situations (very high dimensional data), raising theoretical questions and sometimes revisiting classical techniques or results. As illustrated for instance in Wall et al. (2002), PCA or PCA-related methods are used for various tasks in the microarray context, from traditional dimensionality reduction procedures to gene grouping. Having a good understanding of the behavior of the singular values of Gaussian "white noise" matrices could provide valuable insights for these applications. Recent work of Bickel and Levina (2003) about the properties of naive Bayes and Fisher's linear discriminant function when $p \gg n$ illustrates the impetus these dimensionality assumptions are also gaining in theoretical studies. Our work is part of the larger effort to investigate the properties of high dimensional data structures. Here it is done in a simple, "null" situation.

We now present a few numerical experiments we carried out to assess how big (or small) $n$ or $p$ should be for Theorems 1 and 2 to be practically useful.

1.1 Numerical experiments

Johnstone (2001) showed empirically that in that situation the Tracy-Widom approximation was reasonably satisfying, even for small matrices. Similarly, to try to assess its accuracy in our setup, we ran the following experiments in Matlab: we picked $n$ and $p$ and generated 10,000 $n \times p$ matrices $X$ with entries iid $\mathcal{N}(0,1)$. Then we used standard routines (normest in Matlab) to compute their spectral norms and squared them to obtain a dataset of $l_1$'s.
Following Johnstone (2001), we adjust centering and scaling to

$$\tilde{\mu}_{np} = \left( \sqrt{n - 1/2} + \sqrt{p - 1/2} \right)^2 ,$$

$$\tilde{\sigma}_{np} = \left( \sqrt{n - 1/2} + \sqrt{p - 1/2} \right) \left( \frac{1}{\sqrt{n - 1/2}} + \frac{1}{\sqrt{p - 1/2}} \right)^{1/3} .$$



This leads to a very significant improvement in the quality of the Tracy-Widom approximation for our simulations. Simple manipulations (explained in section 2.2) show that we have some freedom in choosing the centering and scaling: if we replace $n$ by $n + a$ and $p$ by $p + b$ (where $a$ and $b$ are fixed real numbers) in the definitions of $\mu_{np}$ and $\sigma_{np}$, Theorem 1 and Theorem 2 still hold. The particular choice used here is motivated by a careful theoretical analysis of the entries of the operator mentioned in section 2.2.
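
As a concrete illustration of the two normalizations discussed above, here is a minimal Python sketch. The function names (`johnstone_mu_sigma`, `adjusted_mu_sigma`, `standardize`) are hypothetical and not taken from the paper; the formulas assume the centering and scaling of Theorem 1 and the shifted version in which $n$ and $p$ are replaced by $n + a$ and $p + b$, with $a = b = -1/2$ by default.

```python
import math


def johnstone_mu_sigma(n, p):
    """Centering/scaling with n1 = max(n, p) - 1 and p1 = min(n, p)."""
    n1 = max(n, p) - 1
    p1 = min(n, p)
    root_sum = math.sqrt(n1) + math.sqrt(p1)
    mu = root_sum ** 2
    sigma = root_sum * (1 / math.sqrt(n1) + 1 / math.sqrt(p1)) ** (1 / 3)
    return mu, sigma


def adjusted_mu_sigma(n, p, a=-0.5, b=-0.5):
    """Shifted sequences: n and p are replaced by n + a and p + b."""
    ra, rb = math.sqrt(n + a), math.sqrt(p + b)
    mu = (ra + rb) ** 2
    sigma = (ra + rb) * (1 / ra + 1 / rb) ** (1 / 3)
    return mu, sigma


def standardize(l1, mu, sigma):
    """Recenter and rescale a largest eigenvalue l1."""
    return (l1 - mu) / sigma
```

For a given pair such as `(30, 5000)`, comparing the outputs of the two functions gives a feel for how small the shift is relative to the scaling sequence.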

Table 1 summarizes the "quantile" properties of the empirical distributions we obtained and compares them to the Tracy-Widom reference. We used the same reference points as Johnstone (2001).

We picked the dimensions according to two criteria: 100 x 4000, 30 x 5000, and 50 x 5000 were chosen to investigate "representative" microarray situations. We chose the others to have a range of ratios and estimate how valuable the Tracy-Widom approximation would be in situations that could be considered classical, i.e. one small dimension (less than 10) and one large (several hundreds to several thousands). For the sake of completeness, we redid the simulations presented in Johnstone (2001) and present in Table 2 the results obtained with the $\tilde{\mu}_{np}$ and $\tilde{\sigma}_{np}$ centering and scaling.

We see that the fit is good to very good for the upper quantiles (.9 and beyond) across the range of dimensions we investigated. The practical interest of this remark is clear: these are the quantiles one would naturally use in a testing problem. We note that it appears empirically that the problem gets harder when the ratio $r$ of the larger dimension to the smaller one ($p_1$ in our notation) gets bigger: the larger $r$, the larger $p_1$ should be for the approximation to be acceptable.
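
The experiment described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Matlab code used for the tables: the names `largest_eigenvalue`, `simulate_normalized_l1` and `ecdf_at` are hypothetical, a plain power iteration on the small Gram matrix stands in for Matlab's normest, and the centering and scaling are the adjusted ones with the $-1/2$ shifts.

```python
import math
import random

# Reference points of Tables 1 and 2 (Tracy-Widom quantiles).
TW_POINTS = [-3.9, -3.18, -2.78, -1.91, -1.27, -0.59, 0.45, 0.98, 2.02]


def largest_eigenvalue(rows, iters=300):
    """Approximate the largest eigenvalue of R R^T by power iteration."""
    m = len(rows)
    gram = [[sum(a * b for a, b in zip(rows[i], rows[j])) for j in range(m)]
            for i in range(m)]
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    gv = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
    return sum(a * b for a, b in zip(v, gv))  # Rayleigh quotient


def simulate_normalized_l1(small, large, reps, seed=0):
    """Draw `reps` Gaussian small x large matrices and normalize l1."""
    rng = random.Random(seed)
    ra, rb = math.sqrt(large - 0.5), math.sqrt(small - 0.5)
    mu = (ra + rb) ** 2
    sigma = (ra + rb) * (1 / ra + 1 / rb) ** (1 / 3)
    sample = []
    for _ in range(reps):
        X = [[rng.gauss(0.0, 1.0) for _ in range(large)] for _ in range(small)]
        sample.append((largest_eigenvalue(X) - mu) / sigma)
    return sample


def ecdf_at(sample, points):
    """Empirical distribution function of `sample` evaluated at `points`."""
    n = len(sample)
    return [sum(1 for s in sample if s <= t) / n for t in points]
```

Calling `ecdf_at(simulate_normalized_l1(5, 20, reps=2000), TW_POINTS)` produces one column in the spirit of Table 2; with far fewer replications than the 10,000 used in the text, the entries are noticeably noisier.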

1.2 Conclusions and Organization 

From a technical standpoint, the method developed in Johnstone (2001) proves to be versatile, and, at least conceptually, relatively easy to adapt to the case where $n/p \to \infty$. Nevertheless, substantial technical work is needed to obtain Theorem 2. Using the elementary fact (see e.g. theorem 7.3.7 in Horn and Johnson (1990)) that the largest eigenvalue of $X^*X$ is the same as the largest eigenvalue of $XX^*$, it will be sufficient to give the proof in the case $n/p \to \infty$.
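
The elementary fact quoted above is easy to check numerically. The following sketch uses a small fixed $3 \times 2$ matrix chosen for illustration (it is not from the paper) and hypothetical helper names; the top eigenvalue is approximated by power iteration.

```python
import math


def gram(rows):
    """Return R R^T for a matrix R stored as a list of rows."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def top_eigenvalue(G, iters=200):
    """Largest eigenvalue of a symmetric nonnegative definite matrix G."""
    m = len(G)
    v = [1.0 / math.sqrt(m)] * m
    for _ in range(iters):
        w = [sum(G[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Gv = [sum(G[i][j] * v[j] for j in range(m)) for i in range(m)]
    return sum(a * b for a, b in zip(v, Gv))


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 x 2 example matrix
Xt = [list(col) for col in zip(*X)]        # its transpose, 2 x 3

lam_XXt = top_eigenvalue(gram(X))          # X X^T is 3 x 3
lam_XtX = top_eigenvalue(gram(Xt))         # X^T X is 2 x 2
```

Both values agree up to rounding, which is why the proof can be restricted to one of the two regimes $n/p \to \infty$ or $n/p \to 0$.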

From a practical point of view, we show that the Tracy-Widom limit law does not depend on how the sequence $(n,p)$ is embedded. As long as both dimensions go to infinity, the properly re-centered and re-scaled largest eigenvalue converges weakly to this law.

We can compare this with the "classical" situation where $p$ is held fixed, in which case the limiting joint distribution is known, too (see e.g. Anderson (1984), corollary 13.3.2). In this case, the centering is done around $n$ and the scaling is $\sqrt{n}$; elementary computations show that $(l_1 - \mu_{np})/\sigma_{np}$ also has a non-degenerate limiting distribution (possibly changing with each $p$). Nevertheless, even with the classical centering, it is hard to evaluate the marginals in this context and the results are therefore difficult to use in practice.






TW Quantiles   TW     10x1000   10x4000   10x10000   100x4000   30x5000
-3.9           .01    0.009     0.010     0.015      0.012      0.013
-3.18          .05    0.047     0.050     0.060      0.053      0.055
-2.78          .10    0.102     0.107     0.112      0.103      0.105
-1.91          .30    0.303     0.308     0.316      0.304      0.303
-1.27          .50    0.506     0.506     0.522      0.508      0.503
-0.59          .70    0.705     0.704     0.723      0.706      0.702
0.45           .90    0.904     0.904     0.913      0.901      0.904
0.98           .95    0.953     0.951     0.958      0.951      0.953
2.02           .99    0.992     0.990     0.992      0.991      0.991

TW Quantiles   TW     50x5000   50x20000   50x50000   5x200   5x2000   5x20000
-3.9           .01    0.010     0.017      0.021      0.008   0.014    0.018
-3.18          .05    0.053     0.067      0.079      0.047   0.057    0.069
-2.78          .10    0.104     0.125      0.139      0.094   0.110    0.120
-1.91          .30    0.309     0.331      0.345      0.293   0.314    0.320
-1.27          .50    0.502     0.522      0.538      0.500   0.506    0.519
-0.59          .70    0.705     0.718      0.727      0.714   0.712    0.710
0.45           .90    0.899     0.905      0.911      0.911   0.906    0.907
0.98           .95    0.949     0.955      0.957      0.959   0.951    0.954
2.02           .99    0.991     0.992      0.992      0.994   0.992    0.992


Table 1: Quality of the Tracy-Widom Approximation for some large matrices: the leftmost column displays certain quantiles of the Tracy-Widom distribution. The second column gives the corresponding value of its cdf. Other columns give the value of the empirical distribution functions obtained from simulations at these quantiles. $\tilde{\mu}_{np}$ and $\tilde{\sigma}_{np}$ are the centering and scaling sequences.



TW Quantiles   TW     5x5     10x10   100x100   5x20    10x40   100x400
-3.9           .01            0.002   0.008     0.001   0.004   0.008
-3.18          .05    0.003   0.018   0.043     0.019   0.032   0.044
-2.78          .10    0.022   0.054   0.090     0.056   0.077   0.095
-1.91          .30    0.217   0.257   0.295     0.262   0.279   0.294
-1.27          .50    0.464   0.486   0.497     0.490   0.494   0.489
-0.59          .70    0.702   0.703   0.700     0.702   0.707   0.702
0.45           .90    0.903   0.903   0.901     0.905   0.906   0.899
0.98           .95    0.949   0.950   0.950     0.952   0.953   0.949
2.02           .99    0.988   0.990   0.991     0.989   0.990   0.990


Table 2: Quality of the Tracy-Widom Approximation (Continued): the columns have the same meaning as in Table 1. The ratio $p/n$ is smaller than in Table 1 and the matrices are not as big, but the Tracy-Widom approximation is already acceptable for the upper quantiles. $\tilde{\mu}_{np}$ and $\tilde{\sigma}_{np}$ are the centering and scaling sequences.

Our simulation results show that the Tracy-Widom approximation is reasonably good (for the upper quantiles) even when $p$ or $n$ is small. As remarked by Johnstone (2001), Proposition 1.2, this implies that when doing PCA, one could develop (conservative) tests based on the Tracy-Widom distribution that could serve as alternatives to the scree plot or the Wachter plot.

The paper is organized as follows: after presenting (Section 2) the main elements of the proof of Theorem 1, we describe (Section 3) the strategy that will lead to the proof of Theorem 2. We prove the two crucial points needed in Section 4. To make the paper self-contained, we give some background information about different aspects of the problem in the appendices. Several technical issues are also treated there in order to avoid obscuring the proof of the main result.

2 Outline of Johnstone's proof 



Before describing the backbone of the proof presented in Johnstone (2001), we need to introduce a few notational conventions. In what follows, we will use $N$ instead of $p$ to be consistent with the literature. We also denote by AB (for "asymptotic behavior") the situation where $n$, $N$, and $n/N \to \infty$. We will frequently index functions that depend on both $N$ and $n$ with only $N$. The reason for this is that it will allow us to keep the notations relatively light, and that we think of $n$ as being a function of $N$. Notations like $E_N$ and $P_N$ will denote expectation and probability under the measure induced by the matrices (of size $n(N) \times N$) we are working with. Finally, it is technically simpler to work with a matrix $X$ whose entries are standard complex Gaussians (i.e. the real and imaginary parts are independent, and they are both $\mathcal{N}(0, 1/2)$), rather than with entries that are $\mathcal{N}(0,1)$. When we mention the complex case, we refer to this situation.

We now give a quick overview of the important points around which the proof of Theorem 1 was articulated.

At the core of several random matrix theory results lies the fact that the joint distribution of the eigenvalues of the random matrices of interest is known and can be represented as the Fredholm determinant of a certain operator (or a totally explicit function of it).

Building on this, if we introduce a number $b$ that is 1 in the real case and 2 in the complex one, it turns out that one has the representation formula

$$E_N \prod_{i=1}^{N} (1 + f(l_i)) = [\det(\mathrm{Id} + S_N f)]^{b/2} , \qquad (1)$$

where $S_N$ is an explicit kernel, depending of course upon the kind of matrices in which one is interested. Here, $f$ treated as an operator means multiplication by this function. It is clear that if $\chi_t = -1_{\{x : x > t\}}$, we have

$$P_N(l_1 \le t) = [\det(\mathrm{Id} + S_N \chi_t)]^{b/2} .$$

The interested reader can find background information on this in Mehta (1991), chapters 5 and 6, Tracy and Widom (1998) or Deift (2000), chapter 5, which in turn (p. 109) points to Reed and Simon (1972), section 17, vol 4, for background on operator determinants. We stress the fact that all these formulas are finite dimensional.

From the last display, the strategy to show convergence in law in either Theorem 1 or Theorem 2 is clear: fix $s_0$, show that under the relevant assumptions, $P(l_{1,n} \le s_0) \to W_1(s_0)$, and use the fact that $W_1$ is continuous to conclude.

2.1 Complex case 

We just saw that finding the asymptotic behavior of $l_1$ is equivalent to showing the convergence of the determinant of a certain operator. This task can be reduced to showing convergence in trace class norm of this operator (see Reed and Simon (1972) for background on this, e.g. Lemma XIII.17.4 (p. 323)). Through work from Widom (1999), Johnstone (2001) exhibits an integral representation formula for his operator, and the original problem is essentially transformed into showing that certain integrals have a predetermined limit.

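In somewhat more detail, if we call $\alpha = n - N$, and $L_k^{\alpha}$ the $k$-th Laguerre polynomial associated with $\alpha$ (as in Szegő (1975), p. 100), let

$$\phi_k(x) = \sqrt{\frac{k!}{(k+\alpha)!}}\; x^{\alpha/2} e^{-x/2} L_k^{\alpha}(x) ,$$

$\xi_k(x) = \phi_k(x)/x$, $a_N = \sqrt{Nn}$, and finally

$$\phi(x) = (-1)^N \sqrt{\frac{a_N}{2}} \left( \sqrt{n}\, \xi_N(x) - \sqrt{N}\, \xi_{N-1}(x) \right) ,$$

$$\psi(x) = (-1)^N \sqrt{\frac{a_N}{2}} \left( \sqrt{N}\, \xi_N(x) - \sqrt{n}\, \xi_{N-1}(x) \right) .$$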

We note two things: first, there is a slight abuse of notation since $\phi$ and $\psi$ obviously depend on $n$ and $N$, but as in Johnstone (2001), we choose to not carry these indices in the interest of readability. Also, $\phi$ and $\psi$ admit more "compact" representations, in terms of a single Laguerre polynomial, with a modified $\alpha$, or another degree. These are easy to derive using Szegő (1975), p. 102, for instance. Nevertheless we choose to work (except in A7) with the previous representations because of the symmetries they present.

The kernel $S_N$ mentioned in (1) has the representation (Johnstone (2001), equation (3.6))

$$S_N(x,y) = \int_0^{\infty} \phi(x+z)\psi(y+z) + \psi(x+z)\phi(y+z)\, dz .$$



Now let $S$ be the Airy operator. Its kernel is

$$S(x,y) = \frac{\mathrm{Ai}(x)\mathrm{Ai}'(y) - \mathrm{Ai}(y)\mathrm{Ai}'(x)}{x - y} = \int_0^{\infty} \mathrm{Ai}(x+u)\mathrm{Ai}(y+u)\, du ,$$

where Ai denotes the Airy function. It was shown in Tracy and Widom (1994) that, viewing $S$ as an operator on $L^2[s, \infty)$, one had

$$\det(\mathrm{Id} - S) = W_2(s) ,$$

where $W_2$ is the Tracy-Widom law "emerging" in the complex case (see A0). So the complex analog of Theorem 1 follows from the fact that, after defining $S_\tau(x,y) = \sigma_{n,N} S_N(\mu_{n,N} + \sigma_{n,N} x, \mu_{n,N} + \sigma_{n,N} y)$, Johnstone managed to show, for all $s$, that

$$\det(\mathrm{Id} - S_\tau) \to \det(\mathrm{Id} - S) .$$

To do this, he introduced $\phi_\tau(s) = \sigma_{n,N}\, \phi(\mu_{n,N} + s\sigma_{n,N})$, and similarly $\psi_\tau$. Note that we have

$$S_\tau(x,y) = \int_0^{\infty} \phi_\tau(x+z)\psi_\tau(y+z) + \psi_\tau(x+z)\phi_\tau(y+z)\, dz .$$

Since what we are interested in is really $S_\tau \chi_s$, for some fixed $s$, we will view $S_\tau$ as an operator acting on $L^2[s, \infty)$ in what follows.

So the problem becomes to show that, as $n, N \to \infty$,

$$\phi_\tau(s), \psi_\tau(s) \to \frac{1}{\sqrt{2}} \mathrm{Ai}(s) , \qquad (2)$$

and that $\forall s_0 \in \mathbb{R}$, there exists $N_0(s_0)$ such that if $N > N_0$, we have on $[s_0, \infty)$

$$\phi_\tau(s), \psi_\tau(s) = O(e^{-s/2}) . \qquad (3)$$

Once this is shown (we give more details on this later), we can show that $S_\tau \to S$ in the trace class norm of operators on $L^2[s, \infty)$. A classical way to do it is described in the remark at the end of section 3 of Johnstone (2001), which bounds the trace class norm of the difference $S_\tau - S$ in terms of the Hilbert-Schmidt norm of operators whose kernels are related to $\phi_\tau$, $\psi_\tau$ and Ai. This leads to the conclusion that

$$\det(\mathrm{Id} - S_\tau) \to \det(\mathrm{Id} - S) ,$$

since det is continuous with respect to trace class norm. Therefore, the largest eigenvalue of $X^*X$ has the behavior it was claimed it has.

2.2 Real Case 



In the real case, using arguments from Tracy and Widom (1996) and Widom (1999), Johnstone (2001) gets a representation similar to (1), this time involving an operator whose kernel is a $2 \times 2$ matrix (instead of scalar in the complex case). He is then able to relate it to the complex case problem - the matrix operator determinant can be computed as the product of two scalar operator determinants - and shows that the "reduced" variable he works with ought to have the same limit as it had in the Gaussian Orthogonal Ensemble case, which was studied in depth by Tracy and Widom.

For the sake of completeness, we recall that in this situation $\alpha = n - 1 - N$ and

$$P_N(l_1 \le t) = \sqrt{\det(\mathrm{Id} + K_N \chi_t)} .$$

$K_N$ has the representation (in the $N$ even case)

$$K_N = \begin{pmatrix} S_N + \psi \otimes \varepsilon\phi & S_N D - \psi \otimes \phi \\ \varepsilon S_N - \varepsilon + \varepsilon\psi \otimes \varepsilon\phi & S_N + \varepsilon\phi \otimes \psi \end{pmatrix} ,$$



where $D$ is the differential operator, $\varepsilon$ is convolution with the kernel $\varepsilon(x - y)$, and $\varepsilon(x) = \mathrm{sgn}(x)/2$. We note the slight change in $\alpha$ and replace $n$ by $n - 1$ when we need to use the results or formulas derived in the complex case (for instance, the $S_N$ we just mentioned is $S_{n-1,N}$, and not $S_{n,N}$). We refer the reader to Gohberg et al. (2000) for a complement of information on operator determinants and to the end of section VIII in Tracy and Widom (1996) for details on the technical problems that $K_N$ poses.

From a purely technical standpoint, one critical issue is to evaluate the large $n, N$ limit of $c_\phi = \int_0^{\infty} \phi(x)\,dx / 2$. If one can show that it is $1/\sqrt{2}$ when $N \to \infty$ through even values, then Johnstone's considerations hold true all the way and we have the same conclusion as in Theorem 1.

We note that using the interlacing properties of the singular values (as mentioned for instance in Soshnikov (2002), Remark 5; see also Horn and Johnson (1990), theorem 7.3.9), as well as the estimates of the difference (resp. ratio) between two consecutive terms of the centering (resp. scaling) sequence, the $N$ odd case follows immediately from the $N$ even case. To be more precise, we use the fact that

$$\frac{\mu_{n,N} - \mu_{n,N-1}}{\sigma_{n,N}} = O(N^{-1/3}) \to 0 \quad \text{as } n, N \to \infty$$

to check that the $N$ even terms lower and upper bounding the $N$ odd probability have the same limit. Note that the same relationship holds for $\mu_{n+a,N+b}$ and $\mu_{n,N}$, if $a$ and $b$ are fixed real numbers. Therefore, after doing the proof with centering sequence $\mu_{n+3/2,N+1/2}$ (which is technically simpler), we will be able to conclude that the theorem holds true for $\mu_{n,N}$.
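
The order of magnitude used in this interlacing argument can be checked numerically. The sketch below assumes Johnstone's convention $\mu_{n,N} = (\sqrt{n-1} + \sqrt{N})^2$ with the matching $\sigma_{n,N}$; the function names are hypothetical. It rescales the gap by $N^{1/3}$, so that a bounded output is what an $O(N^{-1/3})$ gap predicts.

```python
import math


def mu(n, N):
    """Centering sequence (assumed convention: n - 1 and N)."""
    return (math.sqrt(n - 1) + math.sqrt(N)) ** 2


def sigma(n, N):
    """Scaling sequence matching the centering above."""
    a, b = math.sqrt(n - 1), math.sqrt(N)
    return (a + b) * (1 / a + 1 / b) ** (1 / 3)


def scaled_gap(n, N):
    """(mu_{n,N} - mu_{n,N-1}) / sigma_{n,N}, multiplied by N^(1/3)."""
    return (mu(n, N) - mu(n, N - 1)) / sigma(n, N) * N ** (1 / 3)


# A few (n, N) pairs, from n = N to n much larger than N.
PAIRS = [(100, 100), (10_000, 100), (10**6, 100), (10**6, 1000), (10**9, 1000)]
GAPS = [scaled_gap(n, N) for n, N in PAIRS]
```

The values in `GAPS` stay of order one across these very different aspect ratios, which is consistent with the gap being of order $N^{-1/3}$ whether or not $n/N$ stays bounded.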



Last, to be able to use Soshnikov (2002), Lemma 2, which gives the result we wish for the joint distribution of the $k$ largest eigenvalues, we will need to verify that the entries of the $2 \times 2$ operator converge pointwise, and are bounded above in an exponential way. This is what is done in the proof of Lemma 1 of Soshnikov (2002), and we will show in A8 that the arguments given there can be extended to handle our situation.

3 Further Remarks and Agenda



Most of the work in Johnstone (2001) is done in closed form, and in the finite dimensional case. That has two advantages from our standpoint: as the limiting behavior is only investigated in the last "step", most of the arguments given there carry through for our problem, and the method certainly does.

Therefore, our contribution is mostly technical; it follows very closely the ideas of Johnstone (2001), providing solutions to technical problems appearing in the case we consider. Only at a few points could we not use the approach developed in Johnstone (2001). This led us to an analysis of the complex case that is slightly different from the original one, but the core reasons for which the result holds are the same.

In what follows, we first focus on showing that (2) and (3) hold true when $n$, $N$ and their ratio tend to infinity. This takes care of the complex case. We then turn to the problem of the asymptotic behavior of $c_\phi$, and the technical points we have to verify for the results of Soshnikov (2002) to hold.

The following remarks outline the differences between the analysis we present here and the one done in Johnstone (2001).



3.1 Remarks on adaptation of the original proof 

3.1.1 Complex case 

To show that (2) and (3) held true, Johnstone (2001) essentially reduced his problem to studying the solution of a "perturbed" Airy equation and used tools from Olver (1974) to carefully study it. One point that was used repeatedly was that the turning points of the equation were bounded away from one another when $n, N$ were large. This is not true anymore in the case we consider, and we show how to get around this difficulty. So we do not work with a perturbed Airy equation anymore, but rather with Whittaker functions, which have a close relationship to Laguerre polynomials, and their expansion in terms of parabolic cylinder functions (see A9 for some background information on special functions). In Olver (1980), the case we are interested in was studied in detail, giving us most of the tools we need to show (2). Using Olver (1975), we reinterpret the parabolic cylinder functions results in terms of Airy functions and derive the elements we need to complete the proof of (2) and (3).

The reason for which we could not exactly follow the "original" method is related to the error control function called $V(\xi)$ in Johnstone (2001). This function depends upon the parameter $\omega = 2\lambda/\kappa$, which in the case $n/N \to \gamma \in \mathbb{R}$ is bounded away from 2. This essentially allows a uniform control over $V$, and it is possible to show that this error control function is bounded as a function of $N$. Since the control is actually something like $\exp(\lambda_0 V/\kappa) - 1$, it tends to zero as $N \to \infty$. This gave Johnstone (2001) a way to get part of (2).

In our case, it seems that $V$ would tend to $\infty$, at a rate that is nevertheless $o(\kappa)$. As it seems easier and more promising to use Olver (1980) than to derive the growth of $V$, we choose this approach. Nevertheless, this is the only (but crucial) technicality (in the complex case) that did not carry through by the method described in Johnstone (2001) under AB.



3.1.2 Real Case 

For the $c_\phi$ problem, we provide a closed form expression at given $n, N$ and show that in the limit it is the "right" one as long as $n$ and $N$ tend to $\infty$. This does not use the saddlepoint method, but relies on the availability of a generating function formula for Laguerre polynomials. The proof is done in A7.

A simple modification to Johnstone (2001) would give the same result: in the display preceding (6.13) there, we could write

$$h(t) = \sum_{k=0}^{\infty} c_k t^k = 2^{\alpha/2}\, \Gamma(\alpha/2)\, (1 + t)(1 - t^2)^{-(\alpha/2+1)}$$

and expand $(1 - t^2)^{-(\alpha/2+1)}$. Multiplying by $1 + t$ has a very simple effect on the series, and so $c_k$ is known explicitly.
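
To make the "very simple effect" explicit: writing $a = \alpha/2 + 1$, the binomial series gives $(1 - t^2)^{-a} = \sum_j \frac{(a)_j}{j!} t^{2j}$, and multiplying by $1 + t$ simply duplicates each coefficient onto the next odd power. The Python sketch below computes these coefficients up to the constant prefactor of $h(t)$, which is deliberately left out; the function name is hypothetical.

```python
def normalized_coefficients(alpha, count):
    """First `count` Taylor coefficients of (1 + t) * (1 - t^2)^(-(alpha/2 + 1)).

    The constant prefactor of h(t) multiplies every coefficient and is omitted.
    """
    a = alpha / 2 + 1
    coeffs = []
    term = 1.0          # (a)_0 / 0!
    j = 0
    while len(coeffs) < count:
        coeffs.append(term)            # coefficient of t^(2j)
        if len(coeffs) < count:
            coeffs.append(term)        # coefficient of t^(2j+1)
        term *= (a + j) / (j + 1)      # (a)_{j+1}/(j+1)! from (a)_j/j!
        j += 1
    return coeffs


def series_value(alpha, t, count=200):
    """Partial sum of the series at t, for |t| < 1."""
    return sum(c * t ** k for k, c in enumerate(normalized_coefficients(alpha, count)))
```

Comparing `series_value(alpha, t)` with a direct evaluation of $(1 + t)(1 - t^2)^{-(\alpha/2+1)}$ for some $|t| < 1$ is a quick way to confirm the coefficient pattern.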

In A8, we show how to check that the conditions required for Soshnikov's results to hold are indeed met. They are straightforward consequences of the analysis we will carry out below.

Since the real case is derived from the complex one after analyzing a few technical points, we verify these in the appendices and present here the study of the complex case. We now turn to the main problem we solve in this note: showing (2) and (3) under our set of assumptions.



4 Complex case: study of asymptotics 

In this section, we work on the problem of showing pointwise convergence and uniform boundedness, setting the problem in a way similar to section 5 of Johnstone (2001). We recall his notations, slightly modified to avoid confusion: $N_+ = N + 1/2$, $n_+ = n + 1/2$, $z = \mu_N + \sigma_N s$, with

$$\mu_N = \left( \sqrt{(N+\alpha)_+} + \sqrt{N_+} \right)^2 \quad \text{and} \quad \sigma_N = \left( \sqrt{(N+\alpha)_+} + \sqrt{N_+} \right) \left( \frac{1}{\sqrt{N_+}} + \frac{1}{\sqrt{(N+\alpha)_+}} \right)^{1/3} .$$

For reasons that will be transparent later on, our aim is to show that

$$F_N(z) = (-1)^N \sigma_N^{-1/2} \sqrt{N!/n!}\; z^{(\alpha+1)/2} e^{-z/2} L_N^{\alpha}(z) \to \mathrm{Ai}(s), \quad \forall s \in \mathbb{R} , \qquad (4)$$

and

$$F_N(z) = O(e^{-s}) \ \text{uniformly in } [s_0, \infty), \ s_0 \in \mathbb{R} . \qquad (5)$$

The scaling is slightly different from the original proof: $N^{-1/6}$ has been replaced by $\sigma_N^{-1/2}$. As in Johnstone (2001), we focus on $w_N(z) = z^{(\alpha+1)/2} e^{-z/2} L_N^{\alpha}(z)$, which satisfies

$$\frac{d^2 w}{dz^2} = \left( \frac{1}{4} - \frac{\kappa}{z} + \frac{\lambda^2 - 1/4}{z^2} \right) w ,$$

where $\kappa = N + (\alpha + 1)/2$ and $\lambda = \alpha/2$. Remark that under AB (that is, $n, N, n/N \to \infty$), $\kappa \sim \lambda$. Our strategy is to reformulate the problem in terms of so-called Whittaker functions, denoted $W_{\kappa,\lambda}$, and to use the extensive available studies of these functions to show (2) and (3). Temme (1996), formula (3.1) p. 117, shows that

$$w_N(z) = \frac{(-1)^N}{N!} W_{\kappa,\lambda}(z) .$$
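
The differential equation satisfied by $w_N$ can be sanity-checked numerically. The following Python sketch is an illustration only (the helper names are hypothetical): it evaluates $L_N^{\alpha}$ through the standard three-term recurrence and compares a central finite-difference approximation of $w''$ with the right-hand side $(1/4 - \kappa/z + (\lambda^2 - 1/4)/z^2)\, w$.

```python
import math


def laguerre(N, alpha, x):
    """Generalized Laguerre polynomial L_N^alpha(x) via the three-term recurrence."""
    prev, cur = 1.0, 1.0 + alpha - x
    if N == 0:
        return prev
    for k in range(1, N):
        prev, cur = cur, ((2 * k + 1 + alpha - x) * cur - (k + alpha) * prev) / (k + 1)
    return cur


def w(N, alpha, z):
    """w_N(z) = z^((alpha+1)/2) * exp(-z/2) * L_N^alpha(z)."""
    return z ** ((alpha + 1) / 2) * math.exp(-z / 2) * laguerre(N, alpha, z)


def ode_sides(N, alpha, z, h=1e-3):
    """Return (finite-difference w'', right-hand side of the ODE) at z."""
    kappa = N + (alpha + 1) / 2
    lam = alpha / 2
    lhs = (w(N, alpha, z + h) - 2 * w(N, alpha, z) + w(N, alpha, z - h)) / h ** 2
    rhs = (0.25 - kappa / z + (lam ** 2 - 0.25) / z ** 2) * w(N, alpha, z)
    return lhs, rhs
```

For moderate values such as `ode_sides(4, 3, 5.0)`, the two returned numbers agree up to the finite-difference error, which is of order $h^2$.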

From now on, we will closely follow Olver (1980). Let us remark that

$$F_N(z) = \sigma_N^{-1/2} \frac{1}{\sqrt{N!\, n!}} W_{\kappa,\lambda}(z) .$$

We fix $s_0 \in \mathbb{R}$ and we work only with $z = \mu_N + \sigma_N s$, where $s \ge s_0$.



Preliminaries Following Olver (1980), we introduce $l = \kappa/\lambda$, $\beta = \sqrt{2(l - 1)}$, and the turning points $x_1 = 2l - 2\sqrt{l^2 - 1}$, $x_2 = 2l + 2\sqrt{l^2 - 1}$, after the rescaling $x = z/\lambda$. We remark that the two turning points coalesce at 2 under the hypothesis AB. In the new variable $x$, we have

$$\frac{d^2 W}{dx^2} = \left( \lambda^2 g(x) - \frac{1}{4x^2} \right) W ,$$

where $g(x) = \frac{(x - x_1)(x - x_2)}{4x^2}$. Using the ideas explained in Johnstone (2001), we shall be - eventually - interested in the asymptotics of $F_N(z)$ for $z = \mu_N + \sigma_N s$, or $x = z/\lambda = x_2 + \sigma_N s/\lambda$. Let us now define an auxiliary variable $\upsilon$ by

$$\int_{\beta}^{\upsilon} (\tau^2 - \beta^2)^{1/2}\, d\tau = \int_{x_2}^{x} g^{1/2}(t)\, dt \quad \text{if } x_2 \le x < \infty ,$$

$$\int_{\upsilon}^{\beta} (\beta^2 - \tau^2)^{1/2}\, d\tau = \int_{x}^{x_2} (-g)^{1/2}(t)\, dt \quad \text{if } x_1 \le x \le x_2 .$$
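
The quantities introduced in these preliminaries are simple to compute, which makes the coalescence of the turning points easy to observe. The sketch below is a hedged illustration in the complex case ($\alpha = n - N$); the function name and the dictionary keys are hypothetical. It also exposes two identities that follow directly from the definitions: $x_1 x_2 = 4$ and $\lambda\beta^2 = 2(\kappa - \lambda) = 2N + 1$.

```python
import math


def olver_parameters(n, N):
    """kappa, lambda, l, beta and the turning points x1 <= x2 (complex case)."""
    alpha = n - N
    kappa = N + (alpha + 1) / 2
    lam = alpha / 2
    l = kappa / lam
    beta = math.sqrt(2 * (l - 1))
    root = 2 * math.sqrt(l * l - 1)
    return {"kappa": kappa, "lam": lam, "l": l, "beta": beta,
            "x1": 2 * l - root, "x2": 2 * l + root}


def g(x, x1, x2):
    """g(x) = (x - x1)(x - x2) / (4 x^2); it vanishes at both turning points."""
    return (x - x1) * (x - x2) / (4 * x * x)
```

With $N = 10$ fixed, increasing $n$ from $10^3$ to $10^6$ moves both `x1` and `x2` towards 2, while `lam * beta**2` stays equal to $2N + 1 = 21$ up to rounding.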






We limit $x$ to this range because of the following technically important point: $\sigma_N/\lambda$ tends to zero faster than $x_2 - x_1$ does, and so, when $s$ is bounded below, $x$ will stay in the range $[x_1, \infty)$ for all $N$ greater than a certain $N_0$. This is shown in A2, along with the closely related fact that we can focus on $\upsilon \ge 0$. Our analysis is based on section 3 of Olver (1980), where he builds on Olver (1975), in which he expands Whittaker functions in terms of parabolic cylinder functions.

The condition $\upsilon \ge 0$ is critical, since Olver's expansions depend on the sign of $\upsilon$. Therefore, A2 entitles us to focus on only one specific form of these. From (3.10) p. 219 in Olver (1980), one has


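$$W_{\kappa,\lambda}(\lambda x) = (2\lambda)^{1/4} \left\{ \lambda(2 + \beta^2/2)/e \right\}^{\lambda(1+\beta^2/4)} \left( \frac{\upsilon^2 - \beta^2}{x^2 - 4lx + 4} \right)^{1/4} x^{1/2} \left\{ U\!\left(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda}\right) + \varepsilon_1(\lambda^2, \beta^2, \upsilon) \right\} ,$$

where, if $E$ and $M$ are the weight and modulus functions associated with $U$ in Olver (1975) (p. 156), we have, according to Olver (1980) (3.11) p. 219,

$$\varepsilon_1(\lambda^2, \beta^2, \upsilon) = E^{-1}\!\left(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda}\right) M\!\left(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda}\right) O(\lambda^{-2/3}) \qquad (6)$$

uniformly with respect to $\beta \in [0, B]$ and $\upsilon \in [0, \infty)$, $B$ being an arbitrary positive constant. We recall the main relationship between $U$, $E$ and $M$: for $b \le 0$ and $x \ge 0$, $|U(b,x)| \le E^{-1}(b,x) M(b,x)$.

We now show that we have uniform boundedness on $[s_0, \infty)$. The pointwise convergence result will be a straightforward consequence of the arguments we need to develop to solve this first problem.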



4.1 Uniform Boundedness 

Following up on the previous displays, if $n, N$ are large enough so that $\upsilon \ge 0$, we have

$$|W_{\kappa,\lambda}(\lambda x)| \le (2\lambda)^{1/4} \left\{ \lambda(2 + \beta^2/2)/e \right\}^{\lambda(1+\beta^2/4)} x^{1/2} \left( \frac{\upsilon^2 - \beta^2}{x^2 - 4lx + 4} \right)^{1/4} M E^{-1} \left( 1 + O(\lambda^{-2/3}) \right) , \qquad (7)$$

where we omitted the argument $(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda})$ for readability purposes. Our plan is now to transform this upper bound into a somewhat similar one, involving the modulus and weight function associated with the Airy function, which have the advantage of having only one parameter and known asymptotics.

To carry out this program, we need to split the investigation into two parts: first $s \ge 0$ or $\upsilon \ge \beta$. This will allow us to find an $s_1 \ge 0$ such that $F_N(z) = O(e^{-s})$ on $[2s_1, \infty)$. In the second part, we will just have to consider the case $s \in [s_0, 2s_1]$, and show that $F_N$ is merely uniformly bounded on this interval.

4.1.1 Case $s \ge 0$

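In order to use the results linking parabolic cylinder functions and the Airy function (proved in Olver (1959) and cited in Olver (1975)), let us define yet another auxiliary variable, $\eta$, by

$$\frac{2}{3} \eta^{3/2} \beta^2 = \int_{x_2}^{x} g^{1/2}(t)\, dt .$$

Then, if we call $\mathcal{E}$ and $\mathcal{M}$ the weight and modulus functions associated with the Airy function, we have, as shown in A3:

$$E^{-1}\!\left(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda}\right) \le \mathcal{E}^{-1}(\lambda^{2/3}\beta^{4/3}\eta) \left( 1 + O((\lambda\beta^2)^{-1}) \right) ,$$

$$M\!\left(-\tfrac{1}{2}\lambda\beta^2, \upsilon\sqrt{2\lambda}\right) \le \sqrt{2}\, \pi^{1/4} \left( \Gamma((1 + \lambda\beta^2)/2) \right)^{1/2} \frac{\beta^{1/2}}{(\lambda\beta^2)^{1/12}} \left( \frac{\eta}{\upsilon^2 - \beta^2} \right)^{1/4} \mathcal{M}(\lambda^{2/3}\beta^{4/3}\eta) \left( 1 + O((\lambda\beta^2)^{-1}) \right) .$$

Whence, if we call $\theta = \lambda^{2/3}\beta^{4/3}\eta$,

$$|F_N(\lambda x)| \le K_{n,N}\, x^{1/2} \left( \frac{\eta}{x^2 - 4lx + 4} \right)^{1/4} \mathcal{E}^{-1}(\theta) \mathcal{M}(\theta) \left( 1 + O((\lambda\beta^2)^{-1} \vee \lambda^{-2/3}) \right) .$$

In A4, we show that $K_{n,N} \sim 2^{2/3} (N/n)^{1/4}$ under AB. From now on, $A$ will denote a generic constant; its value may change from display to display. As long as $x \ge x_2$, or $s \ge 0$, we have

$$|F_N(\lambda x)| \le A (N/n)^{1/4} x^{1/2} \left( \frac{\eta}{x^2 - 4lx + 4} \right)^{1/4} \mathcal{E}^{-1}(\theta) \mathcal{M}(\theta) .$$

Now using the fact that (see Olver (1974), chap. 11) $x^{1/4} \mathcal{M}(x) \le A$, $\mathcal{E}^{-1}(x) \le A \exp(-2x^{3/2}/3)$ for $x \ge 0$ and $\lambda\beta^2 = 2N + 1$, we get the new inequality

$$|F_N(\lambda x)| \le A \left( \frac{N}{n} \right)^{1/4} N^{-1/6} \left( \frac{x^2}{x^2 - 4lx + 4} \right)^{1/4} \exp\!\left( -(2\theta^{3/2})/3 \right) .$$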

In A5.1, we show that there exists $s_1$ such that if $s > 2s_1$, $(2\theta^{3/2})/3 > s$. Also, as shown in A6.1,
if $s > 0$, $g$ is positive and increasing in $x$ (or, equivalently, in $s$). Since the rational function of
$x$ appearing in the previous display is just $(4g(x))^{-1/4}$, we can bound it by its value at $x(2s_1)$ on
$[2s_1, \infty)$. In A6.2, we show that, at $s$ fixed, under AB, we have $4g(x) \sim \beta\sigma_N s/\lambda$, and using the
equivalents mentioned in A1, we have $\sigma_N\beta/\lambda \sim 4N^{1/3}/n$, from which we conclude that

$$\left(\frac{N}{n}\right)^{1/4} N^{-1/6}\left(4g(x(2s_1))\right)^{-1/4} \sim N^{1/12}n^{-1/4}\left(8s_1N^{1/3}/n\right)^{-1/4} \sim (8s_1)^{-1/4} .$$

Therefore, if $N$ is large enough,

$$\forall s \in [2s_1, +\infty), \quad |F_N(\lambda x)| \le A\exp(-s) .$$

4.1.2 Case $s \in [s_0, 2s_1]$

Our aim now is just to show that $F_N$ as a function of $s$ is bounded on this interval; from this
we shall immediately have that $F_N = O(\exp(-s))$ on this interval, and we will have a proof of Q.
This part is comparatively simpler: we use equation (7), in which we have $E^{-1} \le 1$, by definition
(Olver (1975), p.156, (5.22)). Now using the display between (6.12) and (6.13) p. 159 of the same
article, we have for $\lambda\beta^2 > 1$ and $\upsilon > 0$,

$$\frac{M(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda})}{\left(\Gamma((1+\lambda\beta^2)/2)\right)^{1/2}} \le \frac{A\,\beta^{1/2}}{(\lambda\beta^2)^{1/12}}\left(\frac{\eta}{\upsilon^2-\beta^2}\right)^{1/4}.$$

Hence,

$$|F_N(\lambda x)| \le K_{n,N}\,A\left(\frac{x^2\eta}{x^2-4lx+4}\right)^{1/4}.$$

However on this interval, $x \to 2$, by A5.2 $\eta = (\lambda\beta^2)^{-2/3}s + o((\lambda\beta^2)^{-2/3})$, and by A6.2
$(x^2 - 4lx + 4) = 4s\sigma_N\beta(1+o(1))/\lambda$. Therefore,

$$\frac{\eta}{x^2-4lx+4} \sim 2^{-2/3}\,2^{-4}\,n/N$$

on the whole interval, and, because of the asymptotic estimate of $K_{n,N}$ given in A4, $F_N$ is bounded
uniformly in $N$ on the interval $[s_0, 2s_1]$.
We can thus conclude that

$$\forall s_0,\ \exists N_0(s_0):\ N > N_0(s_0) \Rightarrow F_N(s) = O_{s_0}(e^{-s}) \text{ on } [s_0, \infty) .$$



4.2 Pointwise convergence 

Having studied in detail the uniform boundedness of $F_N$ makes the pointwise convergence
problem easier. First, since we bounded above $F_N$ in terms of $M$ and $E$, equation ((BJ) shows that
$\varepsilon_1 = O(\lambda^{-2/3}e^{-s})$ on $[s_0, \infty)$. So for fixed $s$, it tends to zero as $N$ gets large. The pointwise limit
of $F_N$ will be the pointwise limit of the parabolic cylinder function part of the expansion. We call
this part $pF_N$, for "principal part".

Using the relationship between $U$ and $\mathrm{Ai}$ that we mention in A3, we have, with $\theta = (\lambda\beta^2)^{2/3}\eta$,

$$pF_N(\lambda x) = K_{n,N}\, x^{1/2}\left(\frac{\eta}{x^2-4lx+4}\right)^{1/4}\left(\mathrm{Ai}(\theta) + \mathcal{E}^{-1}(\theta)\mathcal{M}(\theta)\,O\left((\lambda\beta^2)^{-1}\right)\right).$$

Since $x \to 2$, $K_{n,N} \sim 2^{2/3}(N/n)^{1/4}$ and given the estimate we just mentioned for the ratio
$\eta/(x^2 - 4lx + 4)$, we have

$$K_{n,N}\, x^{1/2}\left(\frac{\eta}{x^2-4lx+4}\right)^{1/4} \to 1 .$$

In other respects, we show in A5.2 that $\theta \to s$ under AB. Finally, $\mathcal{E}^{-1}$ and $\mathcal{M}$ are bounded on
$\mathbb{R}$, as shown in 11.2 (pp.394-397) of Olver (1974). Hence $\mathcal{E}^{-1}(\theta)\mathcal{M}(\theta)(\lambda\beta^2)^{-1} \to 0$ under AB, and
we can conclude that $pF_N(\lambda x) \to \mathrm{Ai}(s)$; combining all the elements gives

$$\forall s \in \mathbb{R}, \quad F_N(\lambda x) \to \mathrm{Ai}(s) \text{ under AB} .$$



4.3 Asymptotics for $\phi_\tau$ and $\psi_\tau$

So far we have shown that $F_N(z) = (-1)^N\sigma_N^{-1/2}\sqrt{z}\,\phi_N(z) \to \mathrm{Ai}(s)$, and that $e^{s}F_N$ was bounded
when $N > N_0$ and $s > s_0$.

Our aim is to show Q and ©. Let us write, as in Johnstone (2001),

where

$$\phi_{I,N}(z) = (-1)^N\sigma_N\sqrt{a_N n}\,\phi_N(z)/(\sqrt{2}\,z) = F_N(z)\,d_N\,(z/\mu_N)^{-3/2} .$$

Study of $\phi_{I,N}$  In the previous display, we have $d_N = (\sigma_N/\mu_N)^{3/2}\sqrt{a_N n/2}$. As $\sigma_N \sim n^{1/2}N^{-1/6}$
and $\mu_N \sim n$, $(\sigma_N/\mu_N)^{3/2} \sim n^{-3/4}N^{-1/4}$. Since $a_N = \sqrt{Nn}$, $a_N n = n^{3/2}N^{1/2}$, and therefore
$d_N \to 1/\sqrt{2}$. But when $s$ is fixed, $z/\mu_N \to 1$, so it follows that

$$\text{under AB}, \quad \phi_{I,N}(\mu_N + \sigma_N s) \to \frac{\mathrm{Ai}(s)}{\sqrt{2}} .$$

To bound $\phi_{I,N}$ for $N > N_0$ and $s > s_0$, we use, as in Johnstone (2001), the uniform bound for $F_N$
and $(z/\mu_N)^{-3/2} \le \exp(-3\sigma_N s/(2\mu_N))$, if $s > 0$. If $s < 0$, we have $(z/\mu_N)^{-3/2} \le (1 + s_0\sigma_N/\mu_N)^{-3/2}$,
and since this converges to 1 under AB, it is bounded if $N$ is large enough. So we have shown that

$$\phi_{I,N}(\mu_N + s\sigma_N)\ \begin{cases} \to 2^{-1/2}\mathrm{Ai}(s), & N \to \infty, \\ \le M e^{-s} \text{ on } [s_0, \infty), & \text{if } N > N_0(s_0) . \end{cases}$$



Study of $\phi_{II,N}$  We use once again the same approach as in Johnstone (2001). We have

$$\phi_{II,N} = u_N v_N\,\phi_{I,N-1} ,$$

where $u_N = (\sigma_N/\sigma_{N-1})\sqrt{a_N/a_{N-1}}$ and $v_N = (N/n)^{1/2}$, and $n_{N-1}$ appearing in $\sigma_{N-1}$ is $n_N - 1$
(for $\phi_{N-1}$ is defined in terms of $L^{\alpha}_{N-1}$ and we should therefore have the same $\alpha = n - N =
(n - 1) - (N - 1)$). Remark that under AB, $v_N \to 0$ and $u_N \to 1$.
Define $s'$ by $\mu_N + \sigma_N s = \mu_{N-1} + \sigma_{N-1}s'$. From

$$s' = \frac{\mu_N - \mu_{N-1}}{\sigma_{N-1}} + \frac{\sigma_N}{\sigma_{N-1}}\,s$$

we deduce that $s' \ge s/2$ on $[0, \infty)$, if $N$ is large enough: as a matter of fact, under AB, $\mu_N - \mu_{N-1} =
O(\sqrt{n/N})$, $\sigma_N \sim n^{1/2}N^{-1/6}$, and $\sigma_N/\sigma_{N-1} \to 1$, so it is larger than 1/2 when $N$ is large enough.
To summarize, we just showed that

$$\phi_{II,N}(\mu_N + s\sigma_N) \le M v_N e^{-s/2} \text{ for } s \in [0, \infty) ,$$

by applying the bound we got for $\phi_{I,N}$ to $\phi_{I,N-1}$ and $s'$ as the dummy variable. Here, we are
implicitly using the fact that since $n/N \to \infty$, $(n - 1)/(N - 1)$ does too, and we can apply all
the results we derived before. On the other hand, when $s \in [s_0, 0]$, we can use the fact that
$(\mu_N - \mu_{N-1}) > 0$ and $\sigma_N/\sigma_{N-1} < 2$ to show that $s' > 2s$ and hence

$$\phi_{II,N}(\mu_N + s\sigma_N) \le M v_N e^{-2s} \le M' v_N e^{-s/2} \text{ for } s \in [s_0, 0] .$$

The conclusion is therefore that

$$\phi_{II,N}(\mu_N + s\sigma_N)\ \begin{cases} \to 0, & N \to \infty, \\ \le M e^{-s/2} \text{ on } [s_0, \infty), & \text{if } N > N_0(s_0) . \end{cases}$$

Hence we have shown that © and © held for $\phi_\tau$. The analysis for $\psi_\tau$ is similar.



5 Appendices 

This section is devoted to giving background information needed to understand the problem 
and make the paper relatively self-contained. We also establish many of the properties needed in 
the course of the proofs of equations © and © here. 

Before we start, let us mention a notation issue: $\alpha$ changes value depending on whether we treat
the complex case or the real one. For the complex case $\alpha + N = n$, whereas for the real one
$\alpha + N = n - 1$. We frequently replace $\alpha + N$ by $n$ in what follows; this is because the proof of
equations © and © is done in the complex case and applies to the real one by just changing $n$ into
$n - 1$ everywhere. When dealing with problems which are real case specific, we keep the notation
$N + \alpha$. The definitions of $\mu_N$ and $\sigma_N$ are also given in terms of $N + \alpha$ to highlight the adjustments
needed when dealing with the real or the complex case.

A0: Tracy-Widom distributions

We recall here the definition of the Tracy-Widom distributions. We split the description
according to whether the entries of the matrix we are considering are real or complex.
We first need to introduce the function $q$, defined as

$$q''(x) = xq(x) + 2q^3(x) , \qquad q(x) \sim \mathrm{Ai}(x) \text{ as } x \to \infty .$$

• Complex Case  The Tracy-Widom distribution appearing in the complex case, $W_2$, has
cumulative distribution function $F_2$ given by

$$F_2(s) = \exp\left(-\int_s^\infty (x - s)\,q^2(x)\,dx\right) .$$

The joint distribution is slightly more involved to define. Following Soshnikov (2002), we do it
through its $k$-point correlation functions, using its determinantal point process character (see e.g.
Soshnikov (2000)).

Let us first call $S$ the Airy operator. Its kernel is

$$S(x,y) = \frac{\mathrm{Ai}(x)\mathrm{Ai}'(y) - \mathrm{Ai}(y)\mathrm{Ai}'(x)}{x-y} = \int_0^\infty \mathrm{Ai}(x+u)\,\mathrm{Ai}(y+u)\,du .$$

In the complex case, the $k$-point correlation functions have the property that

$$\rho_k(x_1, \ldots, x_k) = \det_{1\le i,j\le k} S(x_i, x_j) .$$
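The two expressions of the Airy kernel $S(x,y)$ given above can be compared numerically. The following is only a sketch under stated assumptions: the helper names (`airy_ai`, `airy_kernel`, `airy_kernel_integral`) are hypothetical, Ai is evaluated through its Maclaurin series (adequate for moderate arguments only), and the integral is truncated at a finite upper limit.

```python
import math

# Ai(0) and Ai'(0) in closed form (standard values).
AI0 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))
AIP0 = -1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))


def airy_ai(x, nterms=150):
    """Return (Ai(x), Ai'(x)) from the power series solution of w'' = x w.

    Assumption: only used for moderate |x| (say |x| <= 8); the series
    loses accuracy through cancellation for large arguments.
    """
    c = [0.0] * nterms
    c[0], c[1] = AI0, AIP0  # c[2] = 0 since w''(0) = 0
    for n in range(nterms - 3):
        c[n + 3] = c[n] / ((n + 3) * (n + 2))
    ai = sum(c[n] * x ** n for n in range(nterms))
    aip = sum(n * c[n] * x ** (n - 1) for n in range(1, nterms))
    return ai, aip


def airy_kernel(x, y):
    """Airy kernel in its 'Christoffel-Darboux' form."""
    ax, apx = airy_ai(x)
    if x == y:
        return apx * apx - x * ax * ax  # limit of the quotient as y -> x
    ay, apy = airy_ai(y)
    return (ax * apy - ay * apx) / (x - y)


def airy_kernel_integral(x, y, upper=7.0, steps=1400):
    """Composite Simpson approximation of int_0^upper Ai(x+u) Ai(y+u) du."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * airy_ai(x + u)[0] * airy_ai(y + u)[0]
    return total * h / 3.0
```

The truncation at `upper=7.0` is a design choice for small positive `x` and `y`, where the discarded tail is negligible; it would have to be revisited for arguments further to the left.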

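• Real Case  The real counterpart of $W_2$, which is called $W_1$, has cdf $F_1$ with

$$F_1(s) = \exp\left(-\frac{1}{2}\int_s^\infty q(x) + (x - s)\,q^2(x)\,dx\right) .$$

The $k$-point correlation functions satisfy

$$\rho_k(x_1, \ldots, x_k) = \left(\det_{1\le i,j\le k} K(x_i, x_j)\right)^{1/2} ,$$

where the $2\times 2$ matrix kernel of $K$ has entries (see Soshnikov (2002), eq (2.18) to (2.21))

$$K_{1,1}(x,y) = S(x,y) + \frac{1}{2}\mathrm{Ai}(x)\int_{-\infty}^{y}\mathrm{Ai}(u)\,du ,$$

$$K_{2,2}(x,y) = K_{1,1}(y,x) ,$$

$$K_{1,2}(x,y) = -\frac{1}{2}\mathrm{Ai}(x)\mathrm{Ai}(y) - \frac{\partial}{\partial y}S(x,y) ,$$

$$K_{2,1}(x,y) = -\int_0^\infty dt\left(\int_{x+t}^\infty \mathrm{Ai}(v)\,dv\right)\mathrm{Ai}(y+t) - \epsilon(x-y) + \frac{1}{2}\int_y^x \mathrm{Ai}(u)\,du + \frac{1}{2}\int_x^\infty \mathrm{Ai}(u)\,du\int_{-\infty}^{y}\mathrm{Ai}(v)\,dv .$$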



A1: Asymptotic behavior of some simple functions

In this appendix, we present some basic facts and identities that we used throughout the proof.
We will make repeated use of the following observations: since
$\sigma_N = (\sqrt{(N+\alpha)_+} + \sqrt{N_+})(1/\sqrt{N_+} + 1/\sqrt{(N+\alpha)_+})^{1/3}$ and $\lambda = \alpha/2$, under AB we have

$$\sigma_N \sim n^{1/2}N^{-1/6} , \qquad \lambda \sim n/2 .$$

We also use several times the following identities:

Fact 1  With $\lambda = \alpha/2$, $\kappa = N + (\alpha + 1)/2$, and $l = \kappa/\lambda$, $\beta = \sqrt{2(l - 1)}$, we have

$$\lambda\beta^2 = (2N + 1) , \qquad \beta^2 \sim 4N/n .$$

The first remark is simple algebra, and the second one comes from $\beta^2 = 2(l - 1) = 2(2N + 1)/\alpha \sim
4N/n$ under AB. We have the estimates:

Fact 2  $x_2 - x_1 \sim 8\sqrt{N/n}$ and $\sigma_N/\lambda \sim 2n^{-1/2}N^{-1/6}$.

The second one is obvious; the first one comes from the fact that $x_2 - x_1 = 2\sqrt{2}\,\beta(l + 1)^{1/2}$ as
$x_{2,1} = 2l \pm 2\sqrt{l^2 - 1}$. Using Fact 1 immediately gives the claimed result. Finally, we have the
following estimates:

Fact 3  $\beta\sigma_N/\lambda \sim 4N^{1/3}/n$ and $\sigma_N^3/(\lambda\beta^2) \sim (n/N)^{3/2}/2$.
The result directly follows from the aforementioned estimates.
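As an illustration, the identities and estimates of Facts 1 to 3 can be checked numerically. This is a minimal sketch, assuming the complex case ($\alpha + N = n$) and the conventions $N_+ = N + 1/2$, $n_+ = n + 1/2$; the function name `scaling_quantities` is hypothetical.

```python
import math


def scaling_quantities(n, N):
    """Quantities appearing in A1, complex case (alpha + N = n assumed)."""
    alpha = n - N
    lam = alpha / 2.0                  # lambda
    kappa = N + (alpha + 1) / 2.0
    l = kappa / lam
    beta = math.sqrt(2.0 * (l - 1.0))
    root = math.sqrt(l * l - 1.0)
    x1, x2 = 2 * l - 2 * root, 2 * l + 2 * root   # roots of x^2 - 4 l x + 4
    n_plus, N_plus = n + 0.5, N + 0.5
    s = math.sqrt(n_plus) + math.sqrt(N_plus)
    sigma = s * (1 / math.sqrt(N_plus) + 1 / math.sqrt(n_plus)) ** (1.0 / 3.0)
    return {"lam": lam, "kappa": kappa, "l": l, "beta": beta,
            "x1": x1, "x2": x2, "mu": s * s, "sigma": sigma}


if __name__ == "__main__":
    n, N = 10 ** 8, 10 ** 3
    q = scaling_quantities(n, N)
    # Fact 1 (exact identity) and Facts 2-3 (ratios should be close to 1).
    print(q["lam"] * q["beta"] ** 2, 2 * N + 1)
    print((q["x2"] - q["x1"]) / (8 * math.sqrt(N / n)))
    print((q["sigma"] / q["lam"]) / (2 * n ** -0.5 * N ** (-1 / 6)))
    print((q["beta"] * q["sigma"] / q["lam"]) / (4 * N ** (1 / 3) / n))
    print((q["sigma"] ** 3 / (q["lam"] * q["beta"] ** 2)) / ((n / N) ** 1.5 / 2))
```

The ratios approach 1 only as $N \to \infty$ and $n/N \to \infty$, so moderately large values already show the trend but not exact agreement.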

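A2: Working with $\upsilon > 0$

Here we assume that $s \in [s_0, \infty)$. We also assume that $s < 0$, for otherwise we can work with
$\upsilon > \beta > 0$. From A1, we have $|x - x_2| = |s|\sigma_N/\lambda \le |s_0|\sigma_N/\lambda \ll x_2 - x_1$ by Fact 2. Now $\upsilon = 0$
corresponds to $x_0 < \bar{x} = (x_1 + x_2)/2$: as a matter of fact, since $(x_2 - x)(x_1 - x)$ is symmetric
around $\bar{x}$ and $1/x$ is obviously larger on $[x_1, \bar{x}]$ than it is on $[\bar{x}, x_2]$, we have

$$\int_{-\beta}^{\upsilon_{\bar{x}}} (\beta^2 - \tau^2)^{1/2}\,d\tau = \int_{x_1}^{\bar{x}} (-g(t))^{1/2}\,dt > \int_{\bar{x}}^{x_2} (-g(t))^{1/2}\,dt .$$

By symmetry, we also get

$$\int_{-\beta}^{0} (\beta^2 - \tau^2)^{1/2}\,d\tau = \int_{0}^{\beta} (\beta^2 - \tau^2)^{1/2}\,d\tau = \frac{1}{2}\int_{x_1}^{x_2} (-g(t))^{1/2}\,dt < \int_{x_1}^{\bar{x}} (-g(t))^{1/2}\,dt = \int_{-\beta}^{\upsilon_{\bar{x}}} (\beta^2 - \tau^2)^{1/2}\,d\tau ,$$

and therefore, $\upsilon_{\bar{x}} > 0$.

However $\bar{x}$ is always smaller than $x(s_0)$ if $N$ is large enough. So we can limit our investigations to
the case $\upsilon > 0$.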

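A3: Relationship between $E^{-1}$, $M$ and $\mathcal{E}^{-1}$, $\mathcal{M}$

We claim that if $s > 0$, and we define $\theta = (\lambda\beta^2)^{2/3}\eta$, the following inequalities hold true:

$$E^{-1}(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) \le \mathcal{E}^{-1}(\theta)\left(1 + O((\lambda\beta^2)^{-1})\right) ,$$

$$M(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) \le \frac{\sqrt{2}\,\pi^{1/4}\left[\Gamma((1+\lambda\beta^2)/2)\right]^{1/2}\beta^{1/2}}{(\lambda\beta^2)^{1/12}}\left(\frac{\eta}{\upsilon^2 - \beta^2}\right)^{1/4} \times \mathcal{M}(\theta)\left(1 + O((\lambda\beta^2)^{-1})\right) .$$

For the sake of simplicity we call $\Xi$ the part that precedes the sign "$\times$" in the last inequality.
According to Olver (1975), equations (5.12) and (5.13), we have

$$U(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) = \Xi\left\{\mathrm{Ai}(\theta) + \mathcal{M}(\theta)\mathcal{E}^{-1}(\theta)\,O((\lambda\beta^2)^{-1})\right\} ,$$
$$\bar{U}(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) = \Xi\left\{\mathrm{Bi}(\theta) + \mathcal{M}(\theta)\mathcal{E}(\theta)\,O((\lambda\beta^2)^{-1})\right\} .$$

We have, if $s > 0$, $x > x_2$, so

$$\frac{2}{3}\beta^2\eta^{3/2} = \int_{x_2}^{x} g^{1/2}(t)\,dt = \int_{\beta}^{\upsilon} (\tau^2 - \beta^2)^{1/2}\,d\tau . \tag{8}$$

For the Airy function, the weight and modulus functions had different definitions depending on
whether the argument was bigger than the largest root, $c$, of $\mathrm{Ai}(z) = \mathrm{Bi}(z)$ or not. Likewise, the
definition of $E^{-1}$ and $M$ depends on the position of the argument with respect to the largest root
of the equation $U(b,x) = \bar{U}(b,x)$, which is called $\rho(b)$ in Olver (1975).

Where do the auxiliary variables lie when $s > 0$?  We claim that the answer is that $\theta > 0 > c$,
and $\upsilon\sqrt{2\lambda} > \rho(-\lambda\beta^2/2)$.

The first part of equation (8) implies that $\eta > 0$, so $\theta > c$, as $c < 0$. This means that we can use
the definition $\mathcal{M}^2 = 2\mathrm{Ai}\,\mathrm{Bi}$ and $\mathcal{E}^{-1}\mathcal{M} = 2^{1/2}\mathrm{Ai}$. The second part implies that $\upsilon > \beta$; therefore,
$2\lambda\upsilon^2 > 2\lambda\beta^2 \ge \rho(-\lambda\beta^2/2)^2$, since by Olver (1975), equation (5.21), $\rho(b) \le 2(-b)^{1/2}$ when $b \to -\infty$.
This means that we have similar relationships between $E^{-1}$, $M$, $U$, and $\bar{U}$, to the ones we had in
the Airy case, $\bar{U}$ playing the role of $\mathrm{Bi}$, and $U$ playing the role of $\mathrm{Ai}$.

Consequences of their positions  The interesting consequence of the previous remarks is that
we can write, if $N$ is large enough, for all $s > 0$,

$$E^{-2}(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) = \frac{U}{\bar{U}} = \frac{\mathcal{M}(\theta)\mathcal{E}^{-1}(\theta)}{\mathcal{M}(\theta)\mathcal{E}(\theta)}\cdot\frac{2^{-1/2} + O((\lambda\beta^2)^{-1})}{2^{-1/2} + O((\lambda\beta^2)^{-1})} .$$

In other words, we just proved that $\exists N_0$ such that $N > N_0$ implies, $\forall s > 0$,

$$E^{-1}(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) \le \mathcal{E}^{-1}(\theta)\left(1 + O((\lambda\beta^2)^{-1})\right) .$$

By the same arguments, we derive that

$$M(-\lambda\beta^2/2, \upsilon\sqrt{2\lambda}) \le \Xi\,\mathcal{M}(\theta)\left(1 + O((\lambda\beta^2)^{-1})\right) .$$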

A4: Asymptotic behavior of $K_{n,N}$

The aim here is to show that

$$K_{n,N} \sim 2^{2/3}(N/n)^{1/4} .$$

$K_{n,N}$ has the following expression:

$$K_{n,N} = \frac{(2\lambda)^{1/4}\left\{\lambda(2 + \beta^2/2)/e\right\}^{\lambda(1+\beta^2/4)}\sqrt{2}\,\pi^{1/4}\left[\Gamma((1 + \lambda\beta^2)/2)\right]^{1/2}\beta^{1/2}}{(\lambda\beta^2)^{1/12}\sqrt{n!\,N!\,\sigma_N}} .$$

Since $\lambda\beta^2 = (2N + 1)$, $\Gamma((1 + \lambda\beta^2)/2) = \Gamma(N + 1) = N!$.

In other respects, let $A_n = \left\{\lambda(2 + \beta^2/2)/e\right\}^{\lambda(1+\beta^2/4)}/\sqrt{n!}$. Note that $2\lambda + \lambda\beta^2/2 = n - N +
(2N + 1)/2 = n + 1/2 = n_+$. So $A_n = (n_+/e)^{n_+/2}/\sqrt{n!}$. Using Stirling's formula, we get that
$A_n \sim (n_+/n)^{n/2}(n_+/n)^{1/4}(2\pi e)^{-1/4} \sim (2\pi)^{-1/4}$.
Now rewriting

$$K_{n,N} = \frac{A_n(\lambda\beta^2)^{1/4}(8\pi)^{1/4}}{(\lambda\beta^2)^{1/12}\sqrt{\sigma_N}} ,$$

we get that $K_{n,N} \sim 2^{2/3}(N/n)^{1/4}$, from using $A_n(8\pi)^{1/4} \to \sqrt{2}$ and the second estimate of Fact 3
in A1.
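The Stirling estimate of $A_n$ and the resulting equivalent for $K_{n,N}$ can be illustrated numerically. The sketch below assumes the expressions $A_n = (n_+/e)^{n_+/2}/\sqrt{n!}$ and $K_{n,N} = A_n(8\pi)^{1/4}(\lambda\beta^2)^{1/6}/\sqrt{\sigma_N}$ with $\lambda\beta^2 = 2N+1$ (complex case); the function names are hypothetical, and logarithms are used to avoid overflow.

```python
import math


def log_A(n):
    """log of A_n = (n_+/e)^(n_+/2) / sqrt(n!), with n_+ = n + 1/2."""
    n_plus = n + 0.5
    return 0.5 * n_plus * (math.log(n_plus) - 1.0) - 0.5 * math.lgamma(n + 1.0)


def K(n, N):
    """K_{n,N} = A_n (8 pi)^(1/4) (lambda beta^2)^(1/6) / sqrt(sigma_N)."""
    n_plus, N_plus = n + 0.5, N + 0.5
    s = math.sqrt(n_plus) + math.sqrt(N_plus)
    sigma = s * (1 / math.sqrt(N_plus) + 1 / math.sqrt(n_plus)) ** (1.0 / 3.0)
    lam_beta2 = 2 * N + 1          # Fact 1
    return (math.exp(log_A(n)) * (8 * math.pi) ** 0.25
            * lam_beta2 ** (1.0 / 6.0) / math.sqrt(sigma))


if __name__ == "__main__":
    print(math.exp(log_A(10 ** 6)) * (2 * math.pi) ** 0.25)   # close to 1
    n, N = 10 ** 9, 10 ** 3
    print(K(n, N) / (2 ** (2 / 3) * (N / n) ** 0.25))          # close to 1
```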
A5: Asymptotic properties of $\eta$

This appendix is divided into two parts. We first show that there exists $s_1$ such that, if $s > 2s_1$,

$$\frac{2}{3}\lambda\beta^2\eta^{3/2} \ge s . \tag{P1}$$

Then we shall show:

$$\text{uniformly in } s \in [a, b], \quad (2N + 1)^{2/3}\eta = s + o(1) . \tag{P2}$$

A5.1: Proof of P1

This is the argument that was used in A8 of Johnstone (2001). We repeat it for the sake of
completeness.

Let us first suppose that $s$ is given. Since $g(x) = (x - x_1)(x - x_2)/(4x^2)$, we have

$$\sigma_N^2 g(x) = s\,\frac{\sigma_N^3}{\lambda}\,\frac{x_2 - x_1 + s\sigma_N/\lambda}{4(x_2 + s\sigma_N/\lambda)^2} \sim s\,\frac{\sigma_N^3(x_2 - x_1)}{16\lambda} \sim s\,\frac{\sigma_N^3\beta}{4\lambda} ,$$

the first equivalent coming from the fact that when $s$ is fixed, $x_2 - x_1 \gg s\sigma_N/\lambda$, and $x_2 \to 2$.
The second is just $l \to 1$ under AB. Now using the first point of Fact 3 in A1, together with
$\sigma_N^2 \sim nN^{-1/3}$, we get that $(\beta\sigma_N^3)/(4\lambda) \to 1$. So at $s$ fixed,

$$\sigma_N^2 g(x) \to s .$$

Having this information let us now pick $s_1 = 8$. If $N$ is large enough, we have $\sigma_N^2 g(x(s_1)) \ge s_1/2 =
4$. For all (fixed) $N$, $g$ is an increasing function of $s$. Therefore for the same $N$ we will have

$$\forall s \ge s_1, \quad \sigma_N^2 g(x) \ge \sigma_N^2 g(x(s_1)) \ge s_1/2 = 4 ,$$

and hence, since $s \ge s_1 > 0$, $g$ is positive and we have $g^{1/2}(s) \ge 2/\sigma_N$. Therefore,

$$\frac{2}{3}\lambda\beta^2\eta^{3/2} = \lambda\int_{x_2}^{x(s)} g^{1/2}(t)\,dt \ge \lambda\int_{x(s_1)}^{x(s)} g^{1/2}(t)\,dt \ge \lambda\,\frac{2}{\sigma_N}\,\frac{\sigma_N}{\lambda}(s - s_1) = 2(s - s_1) .$$

Consequently, if $s \ge 2s_1$, we have (P1).

A5.2: Proof of P2

Without loss of generality, we can suppose that $a$ and $b$ have the same sign, and $a > 0$. (If it
is not the case, we can split $[a, b] = [a, 0] \cup [0, b]$, apply the reasoning on each of these, and get the
claimed result for the original interval.)
The idea is that on $[a, b]$, we have

$$\frac{(x - x_2)(x_2 + b\sigma_N/\lambda - x_1)}{4(x_2 + a\sigma_N/\lambda)^2} \ge g(x) \ge \frac{(x - x_2)(x_2 - x_1 + a\sigma_N/\lambda)}{4(x_2 + b\sigma_N/\lambda)^2} .$$

Now on both sides, the terms which are not $(x - x_2)$ are $(x_2 - x_1)(1 + o(1)) = 4\beta(1 + o(1))$, again
because $\sigma_N/\lambda \ll \beta$. So if we integrate the square root of the previous inequality between $x_2$ and
$x(s)$, we get

$$\frac{2}{3}(s\sigma_N/\lambda)^{3/2}\,2\sqrt{\beta}\,(1 + o(1))/4 \ge \frac{2}{3}\eta^{3/2}\beta^2 \ge \frac{2}{3}(s\sigma_N/\lambda)^{3/2}\,2\sqrt{\beta}\,(1 + o(1))/4 ,$$

or

$$\frac{1}{2}s^{3/2}(\sigma_N^3\beta/\lambda)^{1/2}(1 + o(1)) \ge \eta^{3/2}\lambda\beta^2 \ge \frac{1}{2}s^{3/2}(\sigma_N^3\beta/\lambda)^{1/2}(1 + o(1)) .$$

The conclusion follows from A1, Fact 3, whose first point, along with the estimate of $\sigma_N$ mentioned
there, shows that $\sigma_N^3\beta/\lambda \sim 4$. We note that (P2) also gives us pointwise convergence of $(\lambda\beta^2)^{2/3}\eta$
to $s$.
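Property (P2) lends itself to a numerical illustration: for $s > 0$ one can integrate $g^{1/2}$ from $x_2$ to $x(s) = x_2 + s\sigma_N/\lambda$ and compare $\theta = (\lambda\beta^2)^{2/3}\eta$ with $s$. The following is a minimal sketch, assuming the complex case ($\alpha + N = n$); the function name `theta_of_s` and the choice of a Simpson rule are illustrative assumptions, not part of the argument.

```python
import math


def theta_of_s(n, N, s, steps=400):
    """theta = (lambda beta^2)^(2/3) eta, with eta defined (for s > 0) by
    (2/3) eta^(3/2) beta^2 = int_{x2}^{x(s)} g^(1/2)(t) dt."""
    alpha = n - N
    lam = alpha / 2.0
    l = (N + (alpha + 1) / 2.0) / lam
    root = math.sqrt(l * l - 1.0)
    x1, x2 = 2 * l - 2 * root, 2 * l + 2 * root
    n_plus, N_plus = n + 0.5, N + 0.5
    sigma = (math.sqrt(n_plus) + math.sqrt(N_plus)) * (
        1 / math.sqrt(N_plus) + 1 / math.sqrt(n_plus)) ** (1.0 / 3.0)
    delta = s * sigma / lam  # x(s) - x2

    # Substituting t = x2 + u^2 removes the square-root singularity at x2:
    # g^(1/2)(t) dt = u^2 sqrt(t - x1) / t du on [0, sqrt(delta)].
    def f(u):
        t = x2 + u * u
        return u * u * math.sqrt(t - x1) / t

    b = math.sqrt(delta)
    h = b / steps
    total = f(0.0) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    integral = total * h / 3.0
    # theta^(3/2) = lambda beta^2 eta^(3/2) = (3/2) lambda * integral
    return (1.5 * lam * integral) ** (2.0 / 3.0)


if __name__ == "__main__":
    for s in (0.5, 1.0, 2.0):
        print(s, theta_of_s(10 ** 6, 100, s))
```

The substitution $t = x_2 + u^2$ is chosen so that the integrand is smooth on the whole integration range, which keeps the quadrature accurate with few nodes.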
A6: Properties of $g$

We first show that $g$ is increasing - at $N$ fixed - as a function of $s$, if $s > 0$. Then we give an
estimate of $4x^2g(x)$ as $N \to \infty$ and $s \in [a, b]$.

A6.1: $g$ is increasing on $s > 0$

Since $g(t) = (t - x_2)(t - x_1)/(4t^2) = (t^2 - 4lt + 4)/(4t^2)$, we have

$$g'(t) = \frac{l}{t^2} - \frac{2}{t^3} = \frac{lt - 2}{t^3} .$$

Now $lx_2 = 2l^2 + 2l\sqrt{l^2 - 1} > 2$, since $l = 1 + (2N + 1)/\alpha > 1$. But $lx > lx_2$ when $s > 0$, and the
assertion is proved.
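The derivative formula used in A6.1 is easy to confirm with a finite difference. This is only an illustrative sketch; the value of $l$ used below is an arbitrary assumption, and the function names are hypothetical.

```python
import math


def g(t, l):
    """g(t) = (t^2 - 4 l t + 4) / (4 t^2)."""
    return (t * t - 4 * l * t + 4) / (4 * t * t)


def g_prime(t, l):
    """Closed form g'(t) = (l t - 2) / t^3."""
    return (l * t - 2) / t ** 3


if __name__ == "__main__":
    l = 1.0002                      # any l > 1 would do
    x2 = 2 * l + 2 * math.sqrt(l * l - 1)
    t, h = x2 + 0.01, 1e-5
    print((g(t + h, l) - g(t - h, l)) / (2 * h), g_prime(t, l))
    print(g_prime(x2, l) > 0)       # g is increasing to the right of x2
```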

A6.2: On the asymptotic behavior of $4x^2g(x)$ for $s \in [a, b]$

This estimate is motivated by the fact that in the course of the proof of the main result, we
have to deal with an expression of the form

$$\frac{\eta}{x^2 - 4lx + 4} .$$

We already studied in detail $\eta$ as a function of $s$ and $N$. We now focus on $x^2 - 4lx + 4$.
Recalling that $x^2 - 4lx + 4 = (x - x_2)(x - x_1)$ and $x = x_2 + s\sigma_N/\lambda$, we have

$$x^2 - 4lx + 4 = s\frac{\sigma_N}{\lambda}\left(x_2 - x_1 + s\frac{\sigma_N}{\lambda}\right) = s\frac{\sigma_N}{\lambda}\left(x_2 - x_1 + o(\beta)\right) ,$$

because the first estimate in Fact 3 shows that $\sigma_N/\lambda = o(\beta)$, and since $s \in [a, b]$, the previous
statement holds true uniformly on this interval. Now $x_2 - x_1 \sim 4\beta$ under AB, and therefore,
uniformly on $[a, b]$,

$$x^2 - 4lx + 4 = s\frac{\sigma_N}{\lambda}4\beta(1 + o(1)) ,$$

as was claimed in 4.1.2. Also, since $x = x_2 + s\sigma_N/\lambda$, and $x_2 = 2 + (2l + 2)^{1/2}\beta + \beta^2$,

$$4g(x) = s\frac{\sigma_N\beta}{\lambda}(1 + o(1)) .$$

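A7: Limit of $c_\phi$

Recall that under the notation of Johnstone (2001),

where $\xi_k(x) = x^{\alpha/2-1}e^{-x/2}L_k^{\alpha}(x)\sqrt{k!/(k+\alpha)!}$. We are interested in

$$v_{k,\alpha} = \sqrt{\frac{k!}{(k + \alpha - 1)!}}\int_0^\infty x^{\alpha/2-1}e^{-x/2}\left(L_k^{\alpha}(x) - L_{k-1}^{\alpha}(x)\right)dx$$

$$\text{(by Szegő (1975) 5.1.13 p.102)} = \sqrt{\frac{k!}{(k + \alpha - 1)!}}\int_0^\infty x^{\alpha/2-1}e^{-x/2}L_k^{\alpha-1}(x)\,dx = \sqrt{\frac{k!}{(k + \alpha - 1)!}}\; I_{k,\alpha} .$$

Now using Szegő (1975) 5.1.9 p.101,

$$\sum_{k=0}^{\infty} w^k L_k^{\alpha-1}(x) = (1 - w)^{-\alpha}\exp\left(-\frac{xw}{1 - w}\right) .$$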
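The generating function quoted from Szegő can be checked numerically by summing the series with Laguerre values obtained from the standard three-term recurrence. This is a minimal sketch; the function names and the truncation level `kmax` are assumptions made for illustration.

```python
import math


def laguerre_values(kmax, a, x):
    """[L_0^(a)(x), ..., L_kmax^(a)(x)] via the three-term recurrence
    (k+1) L_{k+1} = (2k+1+a-x) L_k - (k+a) L_{k-1}."""
    vals = [1.0, 1.0 + a - x]
    for k in range(1, kmax):
        vals.append(((2 * k + 1 + a - x) * vals[k] - (k + a) * vals[k - 1]) / (k + 1))
    return vals[: kmax + 1]


def generating_sum(alpha, x, w, kmax=200):
    """Truncated sum of w^k L_k^(alpha-1)(x) for |w| < 1."""
    vals = laguerre_values(kmax, alpha - 1, x)
    return sum(w ** k * v for k, v in enumerate(vals))


def closed_form(alpha, x, w):
    """(1 - w)^(-alpha) exp(-x w / (1 - w))."""
    return (1 - w) ** (-alpha) * math.exp(-x * w / (1 - w))


if __name__ == "__main__":
    print(generating_sum(3, 1.7, 0.3), closed_form(3, 1.7, 0.3))
```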

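So if $F(\alpha) = \int_0^\infty\left(\sum_{k=0}^{\infty} w^k L_k^{\alpha-1}(x)\right)x^{\alpha/2-1}e^{-x/2}\,dx$, we have:

$$F(\alpha) = (1 - w)^{-\alpha}\int_0^\infty x^{\alpha/2-1}e^{-x/2}e^{-xw/(1-w)}\,dx = (1 - w)^{-\alpha}\,\Gamma(\alpha/2)\left(\frac{2(1 - w)}{1 + w}\right)^{\alpha/2} = 2^{\alpha/2}(1 - w^2)^{-\alpha/2}\,\Gamma(\alpha/2) .$$

Now if $x > 0$, $|L_k^{\alpha-1}(x)| \le L_k^{\alpha-1}(-x)$, by 5.1.6 in Szegő (1975), and hence

$$\sum_{k=0}^{+\infty}\left|w^k L_k^{\alpha-1}(x)\right| \le \sum_{k=0}^{+\infty}|w|^k L_k^{\alpha-1}(-x) = (1 - |w|)^{-\alpha}\exp\left(\frac{x|w|}{1 - |w|}\right) .$$

Therefore, as long as $w \in (-1/3, 1/3)$, we can switch the order of summation and integration, and get,
with $I_{k,\alpha} = \int_0^\infty x^{\alpha/2-1}e^{-x/2}L_k^{\alpha-1}(x)\,dx$,

$$\sum_{k=0}^{\infty} w^k I_{k,\alpha} = 2^{\alpha/2}(1 - w^2)^{-\alpha/2}\,\Gamma(\alpha/2) .$$

But $(1 - w)^{-\alpha/2}\,\Gamma(\alpha/2) = \sum_{k=0}^{\infty}\frac{\Gamma(\alpha/2 + k)}{k!}w^k$, the right-hand side converging without any
difficulty on $(-1/3, 1/3)$, and hence

$$\sum_{k=0}^{\infty} w^k I_{k,\alpha} = \sum_{m=0}^{\infty} 2^{\alpha/2}\,\frac{\Gamma(\alpha/2 + m)}{m!}\,w^{2m} .$$

So we have

$$\forall k \in 2\mathbb{N}, \quad I_{k,\alpha} = \frac{2^{\alpha/2}\,\Gamma((\alpha + k)/2)}{(k/2)!} .$$

Now $v_{k,\alpha} = 2^{\alpha/2}\sqrt{\frac{k!}{(k+\alpha-1)!}}\,\frac{\Gamma((\alpha+k)/2)}{(k/2)!}$. Since $\Gamma(x) \sim (x/e)^x\sqrt{2\pi/x}$, we have

$$\frac{\Gamma((\alpha + k)/2)}{\sqrt{\Gamma(k + \alpha)}} \sim 2^{-(\alpha+k)/2}(\alpha + k)^{-1/4}(2\pi)^{1/4}\sqrt{2} , \qquad \frac{\sqrt{k!}}{(k/2)!} \sim 2^{k/2}(\pi k)^{-1/4}\,2^{1/4} ,$$

which in turn leads to

$$v_{k,\alpha} \sim 2^{\alpha/2}\left(k(\alpha + k)\right)^{-1/4}2^{-(\alpha+k)/2}\,2^{k/2}\sqrt{2}\sqrt{2} \sim 2\left(k(\alpha + k)\right)^{-1/4} \sim 2/\sqrt{a_k} .$$

Hence, as $N$ is even, $\sqrt{2}\,c_\phi = v_{N,\alpha}\sqrt{a_N}/2 \to 1$.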
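The closed form obtained in A7 for the integrals $I_{k,\alpha} = \int_0^\infty x^{\alpha/2-1}e^{-x/2}L_k^{\alpha-1}(x)\,dx$ (even $k$), and their vanishing for odd $k$, can be verified exactly for small integer parameters by integrating the Laguerre polynomial term by term. The sketch below assumes integer $\alpha \ge 1$; the function names are hypothetical.

```python
import math


def I_termwise(k, alpha):
    """I_{k,alpha} for integer alpha >= 1, from the explicit expansion
    L_k^(a)(x) = sum_j (-1)^j C(k+a, k-j) x^j / j!  with a = alpha - 1."""
    total = 0.0
    for j in range(k + 1):
        coeff = (-1) ** j * math.comb(k + alpha - 1, k - j) / math.factorial(j)
        # int_0^inf x^(alpha/2 - 1 + j) e^(-x/2) dx = 2^(alpha/2 + j) Gamma(alpha/2 + j)
        total += coeff * 2.0 ** (alpha / 2.0 + j) * math.gamma(alpha / 2.0 + j)
    return total


def I_closed(k, alpha):
    """Closed form 2^(alpha/2) Gamma((alpha + k)/2) / (k/2)!, for even k."""
    return 2.0 ** (alpha / 2.0) * math.gamma((alpha + k) / 2.0) / math.factorial(k // 2)


if __name__ == "__main__":
    for alpha in (2, 3, 5):
        for k in (0, 2, 4, 6):
            print(alpha, k, I_termwise(k, alpha), I_closed(k, alpha))
        print(alpha, "odd k:", I_termwise(3, alpha))
```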
A8: On Soshnikov (2002) Lemma 1

For Soshnikov (2002) Lemma 1 to hold true in our case, we have to check two things. First that
not only does $\sigma_N\phi(\mu_N + \sigma_N s) \to \mathrm{Ai}(s)/\sqrt{2}$, but also that this is true for the derivative:

$$\sigma_N^2\,\phi'(\mu_N + \sigma_N s) \to \frac{1}{\sqrt{2}}\mathrm{Ai}'(s) . \tag{S1}$$

We also have to verify that $\sigma_N^2\,\phi'(\mu_N + \sigma_N s)$ is bounded above by $A(s_0)\exp(-\Delta s)$ on $[s_0, \infty)$, where
$\Delta$ is a positive constant. We need to verify this for $\psi$ as well, but the techniques are similar, so we
will verify it only for $\phi$.

The second point that we need to check is that 



u)du il)(y + z) I dz —* as iV 



oo 



(S2) 



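A8.1: Proof of (S1)

It is easy to see that all we need to work on are the properties of $s \mapsto F_N(\mu_N + \sigma_N s)$; if
we can show that $\sigma_N F_N'(\mu_N + \sigma_N s) \to \mathrm{Ai}'(s)$, and that it is bounded by $A(s_0)e^{-\Delta s}$ on $[s_0, \infty)$, we
will be done.
We have very easily that

$$-\sigma_N F_N'(\mu_N + \sigma_N s) = \int_s^\infty \sigma_N^2\,\frac{d^2F_N}{dx^2}(\mu_N + \sigma_N u)\,du .$$

So the strategy is clear: we want to show that the integrand in the right-hand side is bounded
by an integrable function and that it converges pointwise to $\mathrm{Ai}''(u) = u\,\mathrm{Ai}(u)$.
However,

$$\frac{d^2F_N}{dx^2}(\mu_N + \sigma_N u) = \left(\frac{1}{4} - \frac{\kappa_N}{\mu_N + \sigma_N u} + \frac{\lambda^2 - 1/4}{(\mu_N + \sigma_N u)^2}\right)F_N(\mu_N + \sigma_N u) ,$$

and since we already know that $F_N(\mu_N + \sigma_N s) \to \mathrm{Ai}(s)$, we first need to check that, pointwise,

$$\sigma_N^2\left(\frac{1}{4} - \frac{\kappa_N}{\mu_N + \sigma_N s} + \frac{\lambda^2 - 1/4}{(\mu_N + \sigma_N s)^2}\right) \to s .$$

In turn, this reduces to showing that

$$\sigma_N^2\left(\frac{1}{4} - \frac{\kappa_N}{\mu_N} + \frac{\lambda^2 - 1/4}{\mu_N^2}\right) \to 0 , \quad \text{and} \quad \frac{\sigma_N^3}{\mu_N}\left(\frac{\kappa_N}{\mu_N} - 2\,\frac{\lambda^2 - 1/4}{\mu_N^2}\right) \to 1 .$$

The first result comes from the remarkable equality $\kappa_N/\mu_N - \lambda^2/\mu_N^2 = 1/4$, which follows from
the fact that if we call $x = \sqrt{N_+/(N + \alpha)_+}$, we have $\kappa_N/\mu_N = .5 - x/(1 + x)^2$ and $\lambda^2/\mu_N^2 =
.25 - x/(1 + x)^2$. Using these estimates, we see that $\kappa_N/\mu_N - 2(\lambda/\mu_N)^2 = x/(1 + x)^2$,
from which we conclude that the second result holds.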
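The "remarkable equality" and the identity behind the second limit can be confirmed numerically. The sketch below assumes the complex case ($\alpha + N = n$), $\mu_N = (\sqrt{n_+} + \sqrt{N_+})^2$ and $\sigma_N = (\sqrt{n_+} + \sqrt{N_+})(1/\sqrt{N_+} + 1/\sqrt{n_+})^{1/3}$; with these conventions both quantities are exact identities, up to the $-1/4$ corrections that vanish in the limit. The function name is hypothetical.

```python
import math


def centering_identities(n, N):
    """Return (kappa/mu - lambda^2/mu^2,
               sigma^3/mu * (kappa/mu - 2 lambda^2/mu^2)),
    expected to equal (1/4, 1) in the complex case alpha + N = n."""
    n_plus, N_plus = n + 0.5, N + 0.5
    alpha = n - N
    lam = alpha / 2.0
    kappa = N + (alpha + 1) / 2.0
    s = math.sqrt(n_plus) + math.sqrt(N_plus)
    mu = s * s
    sigma = s * (1 / math.sqrt(N_plus) + 1 / math.sqrt(n_plus)) ** (1.0 / 3.0)
    first = kappa / mu - lam ** 2 / mu ** 2
    second = sigma ** 3 / mu * (kappa / mu - 2 * lam ** 2 / mu ** 2)
    return first, second


if __name__ == "__main__":
    print(centering_identities(5000, 40))
```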

Note that if we changed the centering and scaling (replacing n by n = n + a and N by N = N + P), 
by studying the first expression in this case as a "perturbation" of the study we just did, and using 
the fact that y,^ — /ijy = 0(y/n/N), one could show that the first expression is then 0(A r ~ 1 / 3 ), 
and so the result would hold. We also have corresponding results for the second expression. This
shows that we have some freedom in the centering and scaling we pick. It is also needed to show
that 

(j 2 n 4>(ijl n + a N s) -> Ai(s) , 

since in our splitting of $\phi$, the second part $\phi_{II,N}$ corresponds to parameters $(n - 1, N - 1)$, but is
centered and scaled using $\mu_N$ and $\sigma_N$, defined with $(n, N)$.

To show that the sequence of functions we are interested in is bounded above by an integrable
function, we split $[s_0, \infty)$ into $[s_0, \sqrt{n}]$ and $[\sqrt{n}, \infty)$. On the first interval, we can apply the previous
results since $\sigma_N s/\mu_N$ is small compared to 1. So in particular the whole integrand will be smaller
than $A(s_0)(1 + |s|)^2\exp(-s/2)$, after taking into account the properties of $F_N$. On the other hand,
on $[\sqrt{n}, \infty)$, $\sigma_N^2 \le s^2$, and the denominators involving $s$ are bigger than $\mu_N$ and $\mu_N^2$ respectively,
which gives immediately that the integrand is less than $A(s_0)s^2\exp(-s/2)$. From this we conclude
that the integrand is less than $A(s_0)\exp(-s/4)$, for instance, and that therefore the derivative we
are interested in is too.

It then follows easily that (S1) is true, and we also showed that the left-hand side of (S1) is
dominated on $[s_0, \infty)$ and for $N > N_0(s_0)$ by $A(s_0)e^{-s/4}$.



A8.2: Proof of (S2)

The approach laid out in Soshnikov (2002) p. 1044 works after some modifications. We first
write



(u)du ip{y + z) ) dz 



,5/8 



(u)du ij)(y + z) I dz+ 



5 / 8 \Jo 



{u)du ip{y + z) ] dz 



Then we can check, via a third order asymptotic development in $x$ of the right-hand side of equation
(2.10) in Olver (1980), that equation (2.18) therein is still true in our case, since, with his notations,
%n < n~ 3 / 8 . Therefore, the analysis carried out after equation (3.21) of the same reference applies, 
and after integration of the expansion following (3.22) adapted to our situation, we can show that 



,5/8 



(u)du = 0(n- n / 16 ) 



With this estimate and this splitting of $[0, \infty)$, the rest of Soshnikov's argument holds true and
therefore (S2) can be verified.



A9: A quick look at special functions 

In this note, we mentioned three types of special functions, Airy, Whittaker, and parabolic 
cylinder functions. We recall their definition in this appendix, as well as the main ideas behind 
some of the transformations Olver used. To justify their introduction, let us say that they play a 
special role because it is possible, in the setting we were in, to write the functions we studied as a 
perturbation of the differential equations these functions satisfy. 



A9.1: Airy function 

Let us consider the following second order differential equation:

$$\frac{d^2w}{dx^2} = xw .$$

General remark: Recessive solutions  Since these functions are used to get asymptotic
expansions, it makes sense to define the independent solutions with respect to their behavior at $+\infty$.
Usually, independent solutions $w_1$ and $w_2$ are sought, so that $w_2 = o(w_1)$ at a particular point of
the (extended) real line. In our cases, it will be $\infty$. $w_2$ is called the recessive solution. That leaves
the problem underdetermined, but with this in mind, one can then give enough constraints so the
problem is fully determined, and solve in terms of recessive and dominant solutions. For a more
precise definition of recessivity, see Olver (1974), p. 155.

In the case of the Airy function, we have for example (from Olver (1974), 11.1, p. 392):

$$\mathrm{Ai}(x) = \frac{1}{\pi}\int_0^\infty \cos(t^3/3 + xt)\,dt ,$$

$$\mathrm{Bi}(x) = \frac{1}{\pi}\int_0^\infty \left\{\exp(-t^3/3 + xt) + \sin(t^3/3 + xt)\right\}dt .$$



A9.2: Whittaker functions 

These are solutions of the following differential equation:

$$\frac{d^2W}{dx^2} = \left(\frac{1}{4} - \frac{\kappa}{x} + \frac{\lambda^2 - 1/4}{x^2}\right)W . \tag{9}$$

$W_{\kappa,\lambda}$, the recessive solution at $\infty$, is obtained by requiring

$$W_{\kappa,\lambda}(x) \sim e^{-x/2}x^{\kappa} \text{ as } x \to \infty .$$

The other solution is $M_{\kappa,\lambda}$, which is required to satisfy

$$M_{\kappa,\lambda}(x) \sim x^{\lambda+1/2} \text{ as } x \to 0^+ .$$

For more detail on these, see Olver (1974), p.260, or Olver (1980).



A9.3: Parabolic cylinder functions 

According to Olver (1959), equation (2.9) p. 133, parabolic cylinder functions satisfy (in the case
we are interested in)

$$\frac{d^2W}{dx^2} = \left(\frac{1}{4}x^2 + a\right)W .$$

$U(a, x)$ is chosen to satisfy

$$U(a, x) \sim x^{-a-1/2}e^{-x^2/4} \text{ as } x \to +\infty .$$

On the other hand, $\bar{U}$ satisfies

$$\bar{U}(a, x) \sim (2/\pi)^{1/2}\,\Gamma(1/2 - a)\,x^{a-1/2}e^{x^2/4} \text{ as } x \to +\infty .$$

$\bar{U}$'s definition is actually fairly complicated, and can be found in Olver (1959), equation (2.12) or
in Olver (1975), section 5.1.



A9.4: On the usage of these functions 

As we mentioned earlier, these functions play a central role because it is relatively easy to
transform the equations in which we are interested into one of the three mentioned above, or a
perturbation of it. Then a range of techniques is available to study the effect of the perturbation,
and one can sometimes, and obviously in the case we examine, get asymptotic expansions in terms
of the "non-perturbed" solutions. Since these functions are quite well known, information can be
gathered about the function of original interest this way.

For example, in Johnstone (2001), section 5, after the scaling $\xi = x/\kappa$, the Whittaker equation (9)
becomes

$$\frac{d^2W}{d\xi^2} = \left\{\kappa^2\,\frac{(\xi - \xi_1)(\xi - \xi_2)}{4\xi^2} - \frac{1}{4\xi^2}\right\}W .$$

Using the Liouville-Green transformation $\zeta(d\zeta/d\xi)^2 = (\xi - \xi_1)(\xi - \xi_2)/(4\xi^2)$, with $w = (d\zeta/d\xi)^{1/2}W$,
one has

$$\frac{d^2w}{d\zeta^2} = \left\{\kappa^2\zeta + \psi(\zeta)\right\}w .$$

This is a perturbation of the (scaled) Airy equation, for $\mathrm{Ai}(\kappa^{2/3}\zeta)$ and $\mathrm{Bi}(\kappa^{2/3}\zeta)$ are solutions of
$d^2w/d\zeta^2 = \kappa^2\zeta w$.

$w$ is not $W$, but it can be related to it, and it is through this means that Johnstone did his original
analysis. As $\psi$ is a relatively involved function of $\xi$ and $\zeta$, we do not make it explicit, but just mention that
the understanding of $\psi$ is key to getting the uniform bound (2). For more on this, see Johnstone
(2001) or Olver (1974), theorem 11.3.1 p.399.

The problem we encountered (and mentioned in 3.1.1) about the error control function is exactly
here: we could not get enough information about the behavior of $\psi$ under AB, so we slightly changed
approach and turned to other studies.

In Olver (1980), Olver starts with equation (9), where the dummy variable was $z$. Writing
$x = z/\lambda$ and $l = \kappa/\lambda$, he gets

$$\frac{d^2W}{dx^2} = \left\{\lambda^2 g(x) - \frac{1}{4x^2}\right\}W .$$

As he aims to expand the solution in terms of parabolic cylinder functions, he changes variables
another time, by writing

$$W = \left(\frac{dx}{d\zeta}\right)^{1/2}w , \qquad \left(\frac{d\zeta}{dx}\right)^2 = \frac{x^2 - 4lx + 4}{4x^2(\zeta^2 - \beta^2)} ,$$

with $\beta = \{2(l - 1)\}^{1/2}$. Hence, he gets

$$\frac{d^2w}{d\zeta^2} = \left\{\lambda^2(\zeta^2 - \beta^2) + \psi(\lambda, \beta, \zeta)\right\}w ,$$

with $\psi(\lambda, \beta, \zeta) = -\dot{x}^2/(4x^2) + \dot{x}^{1/2}\,d^2(\dot{x}^{-1/2})/d\zeta^2$. His Olver (1975) is a study of this type of
equations, and in particular of the control of the deviation of the solution of the previous equation
to the corresponding parabolic cylinder function. In Olver (1980), he studies very explicitly the
abstract estimate he gets in Olver (1975) in the case of Whittaker functions. We use this repeatedly
in our study, as it is essential to get the crucial property JHJ).



References 

TW Anderson. An introduction to multivariate statistical analysis. Wiley, 1984. 

P.J. Bickel and E. Levina. Some theory for Fisher's Linear Discriminant function, naive "Bayes", 
and some alternatives when there are many more variables than observations. Technical Report 
404, University of Michigan, Department of Statistics, 2003. 

P.A Deift. Orthogonal polynomials and random matrices: a Riemann-Hilbert Approach, volume 3 
of Courant Lecture Notes. AMS, 2000. 






I. Gohberg, S. Goldberg, and N. Krupnik. Traces and determinants of linear operators, volume 116 
of Operator theory advances and applications. Birkhauser, 2000. 

R.A Horn and C.R Johnson. Matrix Analysis. Cambridge University Press, 1990. 

I.M Johnstone. On the distribution of the largest eigenvalue in principal component analysis. 
Annals of Statistics, 29(2):295-327, 2001. 

I.M Johnstone. Personal communication, 2003.

L. Laloux, P. Cizeau, J.-P. Bouchaud, and M. Potters. Noise dressing of financial correlation
matrices. Physical review letters, 83(7):1467-1470, 1999.

M.L Mehta. Random Matrices. Academic Press, 2nd edition, 1990. 

F.W.J Olver. Uniform asymptotic expansions for Weber parabolic cylinder functions of large orders. 
J. Res. Natn. Bur. Stand., 63(B):131-169, 1959. 

F.W.J Olver. Asymptotics and Special functions. Academic Press, 1974. 

F.W.J Olver. Second-order linear differential equations with two turning points. Phil. Trans. Roy. 
Soc. London, Series A, 278:137-174, 1975. 

F.W.J Olver. Whittaker functions with both parameters large: uniform approximations in terms
of parabolic cylinder functions. Proc. Roy. Soc. Edinburgh, 86(A):213-234, 1980.

M. Reed and B. Simon. Methods of modern mathematical physics. Academic Press, 1972. 

A. Soshnikov. Determinantal random point fields. Russian Mathematical Surveys, 55(5):923-975, 
2000. 

A. Soshnikov. A note on universality of the distribution of the largest eigenvalues in certain sample 
covariance matrices. Journal of Statistical Physics, 108(5/6):1033-1056, September 2002. 

G. Szegő. Orthogonal Polynomials, volume 23 of AMS Colloquium Publications. AMS, 1975.

N. Temme. Asymptotic estimates for Laguerre polynomials. Journal of Applied Mathematics and 
Physics (ZAMP), 41:114-126, 1990. 

C.A Tracy and H. Widom. Level-spacing distribution and the Airy kernel. Communications in
Mathematical Physics, 159:151-174, 1994.

C.A Tracy and H. Widom. On orthogonal and symplectic matrix ensembles. Communications in
Mathematical Physics, 177:727-754, 1996.

C.A Tracy and H. Widom. Correlation functions, cluster functions and spacing distributions for
random matrices. Journal of Statistical Physics, 92:809-835, 1998.

M.E Wall, A. Rechsteiner, and L. Rocha. A practical approach to Microarray Data Analysis, 
chapter Singular Value Decomposition and Principal Component Analysis, pages 91-109. Kluwer 
Academic Publishers, 2003. 

Harold Widom. On the relation between orthogonal, symplectic and unitary matrix ensembles. 
Journal of Statistical Physics, 94:347-364, 1999. 






